REMARKS

The specification has been amended to capitalize trademarks and remove reference to
embedded hyperlinks. 

Applicants have cancelled Claims 1-3, 7-10 and 15 without prejudice to, or disclaimer of, 
the subject matter contained therein. Applicants maintain that the cancellation of a claim makes 
no admission as to its patentability and reserve the right to pursue the subject matter of the 
cancelled claim in this or any other patent application. 

Applicants have amended Claims 4, 5, 6 and 14 to delete elements (a)-(d). Claims 4 and 
5 are amended to include the limitation "wherein said nucleic acid is more highly expressed in
esophageal tumor and kidney tumor tissue compared to normal esophageal and normal kidney 
tissue." Claims 11 and 12 are amended to remove informalities. Claim 14 is amended to 
indicate that the isolated nucleic acid hybridizes under stringent conditions, and recites the 
stringent conditions. Claim 14 also is amended to include "or a complement thereof" to amended
elements (a)-(c), and the following text: "wherein said isolated nucleic acid molecule is suitable
for use as a PCR primer, or probe; and wherein said isolated nucleic acid is at least about 450 
nucleotides in length." Claim 16 is amended to read "at least about 500 nucleotides in length." 
Claim 17 is amended to depend from Claim 4. Claim 19 is amended to indicate that the cell is an 
isolated cell. New Claims 21-31 have been added. 

Applicants submit that no new matter has been added by the amendments, and that 
support for the amendments can be found throughout the specification. For example, support for 
the amendment to Claims 4 and 5 regarding differential expression in esophageal tumor and 
kidney tumor can be found in Example 18 beginning at paragraph [0529], as well as paragraph 
[0336] of the specification. Support for the amendments to Claim 14 can be found, for example, 
at paragraphs [0012], [0317], and [0327] of the specification. Support for the amendment to 
Claim 16 and new Claims 21-25 can be found, for example, at paragraph [0012]. Support for
new Claims 26-31 can be found, for example, in the claims as originally filed, and paragraphs 
[0227] and [0317]. 

The rejections of the presently pending claims are respectfully traversed. Claims 4-6, 11-14,
and 16-31 are presented for examination.

Correction of Inventorship under 37 CFR § 1.48(b)

Applicants request that several inventors be deleted, as these inventors' inventions are no 
longer being claimed in the present application as a result of prosecution. The fee as set forth in 
§ 1.17(i) is submitted herewith.

Priority Determination: 

As an initial matter, the PTO issued the instant Office Action assuming that the earliest 
priority is the instant filing date, May 8, 2002. The PTO argued that the instant application and 
priority application Serial No. 10/006,867 do not meet the requirements of 35 U.S.C. § 112, first
paragraph. However, for the reasons set forth below, the instant application and the priority 
application do meet the requirements of 35 U.S.C. § 112, first paragraph, and therefore, are 
entitled to an earlier priority date. 

Applicants have previously listed the priority information for the instant application in a
Preliminary Amendment mailed September 5, 2002. The preliminary amendment states that the 
instant "application is a continuation of, and claims priority under 35 U.S.C. § 120 to, US 
Application 10/006867 filed 12/6/2001, which is a continuation of, and claims priority under 35 
U.S.C. § 120 to, PCT Application PCT/US00/23328 filed 8/24/2000, which is a continuation-in-part
of, and claims priority under 35 U.S.C. § 120 to, US Application 09/403297 filed
10/18/1999, now abandoned, which is the National Stage filed under 35 U.S.C. § 371 of PCT 
Application PCT/US99/20111 filed 9/1/1999, which claims priority under 35 U.S.C. § 119 to 
U.S. Provisional Application 60/105881 filed 10/27/1998." 

The sequences of SEQ ID NOs: 81 and 82 were first disclosed in U.S. Provisional 
Application 60/105,881 filed 10/27/1998 as SEQ ID NO:1 and 2 and in Figures 1 and 2. These
same sequences were disclosed in PCT/US99/20111 and in 09/403,297 as SEQ ID NO: 141 and 
142, Figures 85 and 86. The data in Example 18 (Tumor Versus Normal Differential Tissue 
Expression Distribution), relied on in part for the utility of the claimed nucleic acids, were first
disclosed in PCT Application PCT/US00/23328 filed 8/24/2000, on page 93, line 3, through page
96, line 35. Thus, Applicants maintain that the present application is fully entitled to the benefit 
of at least the priority date of August 24, 2000.

Rejection under 35 U.S.C. § 101 - Utility

The PTO has rejected Claims 1-20 as lacking a specific, substantial, and credible utility. 
The PTO argues that utilities asserted in the specification are not specific and substantial or well 
established. According to the PTO, "[t]he encoding nucleic acid cannot derive a utility from the
encoded polypeptide because there is neither a known physiological or clinical significance of the
[encoded] polypeptide, and the prior art does not support a very close structural relationship to a 
well described (structurally and functionally) family of known proteins." Office Action at 2. 

The PTO cites Hu et al (J. Proteome Res., 2(4):405-12 (2003)) and Wu et al (Gene 
311:105-110 (2003)) to support its assertion that the literature cautions against drawing 
conclusions based on small changes in transcript expression levels between normal and 
cancerous tissue, and that upregulation of a gene does not necessarily indicate that the tumor is 
vascularized. 

One of the asserted utilities for the claimed nucleic acids is use as a diagnostic tool, as 
well as therapeutically as a target for treatment, based on the data that PRO1557 cDNA is more
highly expressed in esophageal tumor and kidney tumor as compared to normal esophagus and
normal kidney tissue, respectively. The PTO recognizes this as a "possible utility"; however, the
PTO asserts that there is no guidance on how to use this information, that no levels are disclosed, 
and that the information is too sparse to allow the encoding polynucleotide to be used as a 
diagnostic marker for esophageal or kidney tumor. 

The PTO also argues that even if the polynucleotide has utility as a tumor marker, there is 
no such utility for the polypeptide because there is no reason to suspect that there is an alteration 
in the amount of the polypeptide in normal esophagus and kidney tissue compared to esophageal 
and kidney tumor tissue. For the above reasons, the PTO asserts that there is no substantial and 
specific utility for the nucleic acid of SEQ ID NO:81.

Applicants respectfully disagree and submit that for the reasons stated below, the claimed
nucleic acids have a credible, substantial, and specific utility. 

Utility - Legal Standard 

According to the Utility Examination Guidelines ("Utility Guidelines"), 66 Fed. Reg. 
1092 (2001) an invention complies with the utility requirement of 35 U.S.C. § 101, if it has at 
least one asserted "specific, substantial, and credible utility" or a "well-established utility."

Under the Utility Guidelines, a utility is "specific" when it is particular to the subject
matter claimed. For example, it is generally not enough to state that a nucleic acid is useful as a 
diagnostic tool without also identifying the condition that is to be diagnosed. 

The requirement of "substantial utility" defines a "real world" use, and derives from the 
Supreme Court's holding in Brenner v. Manson, 383 U.S. 519, 534 (1966) stating that "The basic 
quid pro quo contemplated by the Constitution and the Congress for granting a patent monopoly 
is the benefit derived by the public from an invention with substantial utility." In explaining the 
"substantial utility" standard, M.P.E.P. § 2107.01 cautions, however, that Office personnel must 
be careful not to interpret the phrase "immediate benefit to the public" or similar formulations 
used in certain court decisions to mean that products or services based on the claimed invention 
must be "currently available" to the public in order to satisfy the utility requirement. "Rather, any 
reasonable use that an applicant has identified for the invention that can be viewed as providing 
a public benefit should be accepted as sufficient, at least with regard to defining a 'substantial' 
utility." (M.P.E.P. § 2107.01, emphasis added). 

The mere consideration that further experimentation might be performed to more fully 
develop the claimed subject matter does not support a finding of lack of utility. M.P.E.P. § 
2107.01(III) cites In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995) in stating
that "Usefulness in patent law ... necessarily includes the expectation of further research and 
development. The stage at which an invention in this field becomes useful is well before it is 
ready to be administered to humans." Further, "[T]o violate § 101 the claimed device must be 
totally incapable of achieving a useful result." Juicy Whip Inc. v. Orange Bang Inc., 51 USPQ2d
1700 (Fed. Cir. 1999), citing Brooktree Corp. v. Advanced Micro Devices, Inc., 977 F.2d 1555, 
1571 (Fed.Cir.1992). 

Indeed, the Guidelines for Examination of Applications for Compliance With the Utility 
Requirement, set forth in M.P.E.P. § 2107(II)(B)(1), give the following instruction to patent
examiners: "If the applicant has asserted that the claimed invention is useful for any particular 
practical purpose . . . and the assertion would be considered credible by a person of ordinary skill 
in the art, do not impose a rejection based on lack of utility."

Utility need NOT be Proved to a Statistical Certainty - a Reasonable Correlation between the
Evidence and the Asserted Utility is Sufficient

An Applicant's assertion of utility creates a presumption of utility that will be sufficient to
satisfy the utility requirement of 35 U.S.C. § 101, "unless there is a reason for one skilled in the
art to question the objective truth of the statement of utility or its scope." In re Langer, 503 F.2d
1380, 1391, 183 USPQ 288, 297 (CCPA 1974). See also In re Jolles, 628 F.2d 1322, 206 USPQ
885 (CCPA 1980); In re Irons, 340 F.2d 974, 144 USPQ 351 (1965); In re Sichert, 566 F.2d
1154, 1159, 196 USPQ 209, 212-13 (CCPA 1977). Compliance with 35 U.S.C. § 101 is a
question of fact. Raytheon v. Roper, 724 F.2d 951, 956, 220 USPQ 592, 596 (Fed. Cir. 1983),
cert. denied, 469 U.S. 835 (1984). The evidentiary standard to be used throughout ex parte
examination in setting forth a rejection is a preponderance of the evidence, or "more likely than
not" standard. In re Oetiker, 977 F.2d 1443, 1445, 24 USPQ2d 1443, 1444 (Fed. Cir. 1992).
This is stated explicitly in the M.P.E.P.:

[T]he applicant does not have to provide evidence sufficient to establish that an 
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a 
whole, it leads a person of ordinary skill in the art to conclude that the asserted 
utility is more likely than not true. M.P.E.P. at § 2107.02, part VII (2004)
(underline emphasis in original, bold emphasis added, internal citations omitted).

The PTO has the initial burden to offer evidence "that one of ordinary skill in the art 
would reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 
1436 (Fed. Cir. 1995). Only then does the burden shift to the Applicant to provide rebuttal 
evidence. Id. As stated in the M.P.E.P., such rebuttal evidence does not need to absolutely prove
that the asserted utility is real. Rather, the evidence only needs to be reasonably indicative of the 
asserted utility. 

In Fujikawa v. Wattanasin, 93 F.3d 1559, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996), the Court
of Appeals for the Federal Circuit upheld a PTO decision that in vitro testing of a novel
pharmaceutical compound was sufficient to establish practical utility, stating the following rule:

[T]esting is often required to establish practical utility. But the test results need 
not absolutely prove that the compound is pharmacologically active. All that is 
required is that the tests be "reasonably indicative of the desired
[pharmacological] response." In other words, there must be a sufficient 
correlation between the tests and an asserted pharmacological activity so as to 
convince those skilled in the art, to a reasonable probability, that the novel
compound will exhibit the asserted pharmacological behavior. Fujikawa v.
Wattanasin, 93 F.3d 1559, 1564, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996) (internal
citations omitted, bold emphasis added, italics in original). 

While the Fujikawa case was in the context of utility for pharmaceutical compounds, the 
principles stated by the Court are applicable in the instant case where the asserted utility is for a
therapeutic and diagnostic use - utility does not have to be established to an absolute certainty, 
rather, the evidence must convince a person of skill in the art "to a reasonable probability." In 
addition, the evidence need not be direct, so long as there is a "sufficient correlation" between 
the tests performed and the asserted utility. 

Thus, the legal standard for demonstrating utility is a relatively low hurdle. An Applicant 
need only provide evidence such that it is more likely than not that a person of skill in the art 
would be convinced, to a reasonable probability, that the asserted utility is true. The 
evidence need not be direct evidence, so long as there is a reasonable correlation between the 
evidence and the asserted utility. The Applicant does not need to provide evidence such that it
establishes an asserted utility as a matter of statistical certainty. 

Even assuming that the PTO has met its initial burden to offer evidence that one of 
ordinary skill in the art would reasonably doubt the truth of the asserted utility, Applicants assert
that they have met their burden of providing rebuttal evidence such that it is more likely than not
those skilled in the art, to a reasonable probability, would believe that the claimed invention is 
useful as a diagnostic tool for cancer. 

Substantial Utility 

Summary of Applicants' Arguments and the PTO's Response

In an attempt to clarify Applicants' argument, Applicants offer a summary of their 
argument and the disputed issues involved. Applicants assert that the claimed nucleic acids have 
utility as diagnostic tools for cancer, particularly esophageal and kidney cancer. Applicants' 
asserted utility rests on the following argument: 

1. Applicants assert they have provided reliable evidence that mRNA for the PRO1557
polypeptide is expressed at least two-fold higher in esophageal tumor and kidney tumor
compared to normal esophageal and kidney tissue, and therefore the claimed nucleic acids are
useful as diagnostic tools. Applicants are not asserting that the claimed nucleic acids will
necessarily provide a definitive diagnosis of cancer, but rather that they are useful, alone or in
combination with other diagnostic tools, to assist in the diagnosis of certain cancers.

Appl. No. : 10/063,713
Filed : May 8, 2002

2. Applicants submit that it is not necessary to know what role the PRO1557 gene plays
in cancer to use its differential expression as a diagnostic tool. 

3. It is not required to prove that the PRO1557 polypeptide is also differentially
expressed in certain tumors to establish the utility of the claimed nucleic acids.

Applicants understand the PTO to be making several arguments in response to
Applicants' asserted utility: 

1. The PTO has challenged the reliability of the evidence reported in Example 18, and
states that it does not provide the expression levels, and that the information is too sparse to 
allow the encoding polynucleotide to be used as a diagnostic marker for tumors; 

2. The PTO cites Hu et al and Wu et al for the assertion that the literature cautions
against drawing conclusions based on small changes in transcript expression levels, and that 
upregulation of a gene does not necessarily indicate that the tumor is vascularized; 

3. The PTO asserts that the nucleic acid cannot derive utility from the encoded
polypeptide because there is no known physiological or clinical significance of the polypeptide 
and because there is no reason to think that there is alteration of encoded polypeptide in 
esophageal tumor or kidney tumor relative to normal esophageal or kidney tissue. 

As detailed below, Applicants submit that the PTO has failed to meet its initial burden to 
offer evidence "that one of ordinary skill in the art would reasonably doubt the asserted utility."
In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995). First, Applicants submit 
herewith a copy of a declaration of J. Christopher Grimaldi, (attached as Exhibit 1) which 
establishes the reliability of the data of Example 18. Second, the references provided by the PTO
are not contrary to Applicants' arguments and evidence, and therefore are not evidence to support the
PTO's position. Third, Applicants submit that given the well-established correlation between a 
change in the level of mRNA with a corresponding change in the levels of the encoded protein, 
the PRO1557 protein is likely differentially expressed in certain tumors. However, utility for the
pending claims does not rely on whether the encoded polypeptide is overexpressed, and as such
whether or not increased levels of PRO1557 mRNA correlate with increased levels of PRO1557
protein is not presently an issue. Fourth, Applicants do not rely on the function of the encoded
polypeptides for utility for the claimed nucleic acids. 

Finally, even if the PTO has met its initial burden, Applicants have submitted enough
rebuttal evidence such that it is more likely than not that a person of skill in the art would be
convinced, to a reasonable probability, that the asserted utility is true. As stated above,
Applicants' evidence need not be direct evidence, so long as there is a reasonable correlation 
between the evidence and the asserted utility. The standard is not absolute or statistical 
certainty. 

Applicants have established that the Gene Encoding the PRO1557 Polypeptide is Differentially
Expressed in Certain Cancers compared to Normal Tissue and is Useful as a Diagnostic Tool 

Applicants first address the PTO's argument that the evidence of differential expression 
of the gene encoding the PRO1557 polypeptide in certain tumors compared to their normal
counterparts is insufficient because the specification provides no information regarding values of 
the differences in transcript levels, and the disclosure of the specification is too sparse. 
Applicants also address the PTO's argument that the data do not establish a utility because the 
specification does not disclose any information on the level of expression, activity, or role of the 
PRO1557 polypeptide in cancer. Applicants submit that the gene expression data provided in
Example 18 of the present application are sufficient to establish a specific and substantial utility
for the claimed nucleic acids related to the gene encoding the PRO1557 polypeptide.

Applicants submit herewith a copy of a declaration of J. Christopher Grimaldi, an expert
in the field of cancer biology, originally submitted in a related co-pending and co-owned patent 
application Serial No. 10/063,557 (attached as Exhibit 1). In paragraphs 6 and 7, Mr. Grimaldi 
explains that the semi-quantitative analysis employed to generate the data of Example 18 is 
sufficient to determine if a gene is over- or underexpressed in tumor cells compared to 
corresponding normal tissue. He states that any visually detectable difference seen between two 
samples is indicative of at least a two-fold difference in cDNA between the tumor tissue and the 
counterpart normal tissue. Thus, the results of Example 18 reflect at least a two-fold difference 
between normal and tumor samples. 

He also states that the results of the gene expression studies indicate that the genes of 
interest "can be used to differentiate tumor from normal," thus establishing their reliability. He 
explains that, contrary to the PTO's assertions, "The precise levels of gene expression are
irrelevant; what matters is that there is a relative difference in expression between normal tissue 
and tumor tissue." (Paragraph 7). Thus, since it is the relative level of expression between 
normal tissue and suspected cancerous tissue that is important, the precise level of expression in 
normal tissue is irrelevant. Likewise, there is no need for quantitative data to compare the level 
of expression in normal and tumor tissue. As Mr. Grimaldi states, "If a difference is detected, 
this indicates that the gene and its corresponding polypeptide and antibodies against the 
polypeptide are useful for diagnostic purposes, to screen samples to differentiate between normal 
and tumor." 

Applicants submit that a lack of known role for the gene encoding PRO1557 in cancer
does not prevent its use as a diagnostic tool for cancer. Whether the differential expression of the
gene encoding PRO1557 is a cause or result of the esophageal and kidney tumors is irrelevant to
whether its differential expression can be used to assist in diagnosis of cancer - one does not
need to know why the PRO1557 gene is differentially expressed, or what the consequence of the
differential expression is, in order to exploit the differential expression to distinguish tumor from 
normal tissue. 

The PTO has recognized that the utility of a nucleic acid does not depend on the function 
of the encoded gene product. The Utility Examination Guidelines published on January 5, 2001 
state "In addition, the utility of a claimed DNA does not necessarily depend on the function of 
the encoded gene product. A claimed DNA may have a specific and substantial utility because,
e.g. it hybridizes near a disease-associated gene or it has a gene regulating activity." (Federal 
Register, Volume 66, page 1095, Comment 14). While Applicants appreciate that actions taken 
in other applications are not binding on the PTO with respect to the present application,
Applicants note that the PTO issues patents relating to nucleic acids which are useful for 
diagnosing particular conditions regardless of whether the nucleic acids are the causative agent 
for the condition. For example, polymorphisms which are indicative of a predisposition to a 
particular condition are patentable (see, e.g., U.S. Patent No. 6,465,185, U.S. Patent No. 
6,228,582, and U.S. Patent No. 6,162,604, submitted herewith as Exhibits 2-4), even though they
may or may not cause the disease itself. Similarly, the present nucleic acids which are useful for
determining whether an individual has cancer are useful regardless of whether or not they are the 
cause of the cancer.

The PTO relies on two references to support its assertion that the literature cautions
researchers from drawing conclusions based on small changes in transcript expression levels 
between normal and cancerous tissue. The PTO cites Hu et al (J. Proteome Res., 2(4):405-12 
(2003)) for support for the conclusion that not all genes with increased expression in cancer have 
a known or published role in cancer. The PTO cites Wu et al (Gene 311:105-110 (2003)) to
support its assertion that upregulation of a gene does not necessarily indicate that the tumor is 
vascularized. Applicants respectfully submit that these references do not satisfy the PTO's 
burden to offer evidence that one of ordinary skill in the art would reasonably doubt the truth of 
the asserted utility. 

In Hu, the researchers used an automated literature-mining tool to summarize and 
estimate the relative strengths of all human gene-disease relationships published on Medline. 
They then generated a microarray expression dataset comparing breast cancer and normal breast 
tissue. Using their data-mining tool, they looked for a correlation between the strength of the 
literature association between the gene and breast cancer, and the magnitude of the difference in 
expression level. They report that for genes displaying a 5-fold change or less in tumors 
compared to normal, there was no evidence of a correlation between altered gene expression and 
a known role in the disease. See Hu at 411. However, among genes with a 10-fold or more 
change in expression level, there was a strong correlation between expression level and a 
published role in the disease. Id. at 412. Importantly, Hu reports that the observed correlation 
was only found among estrogen receptor-positive tumors, not less-prevalent ER-negative tumors.
Id.

The general findings of Hu are not surprising - one would expect that genes with the 
greatest change in expression in a disease would be the first targets of research, and therefore 
have the strongest known relationship to the disease as measured by the number of publications 
reporting a connection with the disease. The correlation reported in Hu only indicates that the 
greater the change in expression level, the more likely it is that there is a published or known role 
for the gene in the disease, as found by their automated literature-mining software. Thus, Hu's 
results merely reflect a bias in the literature toward studying the most prominent targets, and 
reflect nothing regarding the ability of a gene that is 2-fold or more differentially expressed in 
tumors to serve as a disease marker. Hu acknowledges the shortcomings of this method in 
explaining the disparity in Hu's findings for ER-negative versus ER-positive tumors: Hu 
attributes the "bias in the literature" toward the more prevalent ER-positive tumors as the
explanation for the lack of any correlation between number of publications and gene expression 
levels in less-prevalent (and, therefore, less studied) ER-negative tumors. Id. Because of this 
intrinsic bias, Hu's methodology is unlikely to ever note a correlation of a disease with less 
differentially-expressed genes and their corresponding proteins, regardless of whether or not an 
actual relationship between the disease and less differentially-expressed genes exists. 
Accordingly, Hu's methodology yields results that provide little or no information regarding 
biological significance of genes with less than 5-fold expression change in disease.

Applicants submit that a lack of known role for PRO1557 in cancer does not prevent its
use as a diagnostic tool for cancer. There is a difference between use of a gene for distinguishing 
between tumor and normal tissue on the one hand, and establishing a role for the gene in cancer 
on the other. Genes with lower levels of change in expression may or may not be the most 
important genes in causing the disease, but the genes can still show a consistent and measurable 
change in expression. While such genes may or may not be good targets for further research, 
they can nonetheless be used as diagnostic tools. Thus, Hu does not refute the Applicants' 
assertion that the PRO1557 gene can be used as a cancer diagnostic tool because it is
differentially expressed in certain tumors. 

The PTO also cites Wu et al (Gene 311:105-110 (2003)) as support for the PTO's 
assertion that upregulation of a gene does not necessarily indicate that the tumor is vascularized. 
Wu et al identify a gene, BNF-1, as a putative extracellular matrix protein over-expressed in 
breast, lung and colon tumors, which were the only tumors tested. Wu found that in 3 out of 11
breast tumor samples, BNF-1 was up-regulated about 2-fold to 3-fold. Wu at 107. Wu found
that BNF-1 was up-regulated about 2-fold to 3-fold in 2 out of 6 lung tumor samples. Id. at 109.
Wu found that BNF-1 was up-regulated about 2-fold to about 4-fold in 1 out of 6 colon tumor
samples. Id. The coding region of BNF-1 is identical to the coding region of SEQ ID NO:81.
Thus, Wu demonstrates that a gene identical to that of Applicants' claims is over-expressed by
2-fold to 4-fold in some tumor samples, and Wu concludes that this gene is up-regulated in tumors.
Wu also notes that the commercially provided tumor samples did not indicate the vascular state 
of the source tumors. Wu states that the relationship between up-regulation of BNF-1 and tumor 
vascularization is not determined in this study.

The PTO cites Wu for the proposition that even if a gene is up-regulated in a tumor, this
does not serve as an indication that the tumor is vascularized or malignant. Applicants do not
disagree, but Applicants maintain that this position is irrelevant for the purposes of determining 
the utility of a gene. Applicants are not asserting that the claimed nucleic acids will necessarily 
provide a definitive diagnosis of vascularized cancer, but rather that the claimed nucleic acids are 
useful, alone or in combination with other diagnostic tools to assist in the diagnosis of certain 
cancers. 

The PTO appears to require that, in order for a gene asserted to be a tumor marker to have
any utility whatsoever, there must be a demonstration that the gene in question is up-regulated in 
vascularized tumors. This position of the PTO is inconsistent with the analogous standard for 
therapeutic utility of a compound that "the mere identification of a pharmacological activity of a
compound that is relevant to an asserted pharmacological use provides an 'immediate benefit to
the public' and thus satisfies the utility requirement." M.P.E.P. § 2107.01 (emphasis original).
Here, the mere identification of altered expression in tumors is relevant to diagnosis of tumors, 
and, therefore, provides an immediate benefit to the public. The position of the PTO is also 
inconsistent with the statements of Wu itself. Wu discusses that observations similar to that for
BNF-1 have been made for other solid tumor oncogenes such as N-MYC. Wu at 109. Thus, Wu
indicates that the BNF-1 expression pattern is consistent with that observed for other oncogenes.
Accordingly, far from serving as evidence of a lack of utility for the claimed nucleic acids, Wu
supports their utility because BNF-1, which Wu states has expression patterns consistent with
oncogenes, has the identical coding region to SEQ ID NO:81 recited in Applicants' claims.

Moreover, Wu also serves as evidence contrary to the PTO's position that changes in 
expression level below 5-fold are insufficient to supply utility (for which the PTO relies on the 
Hu reference). Wu reports only increases in BNF-1 expression of 2-fold to 4-fold. These results
lead Wu to conclude that BNF-1 is up-regulated in tumors and that the expression pattern for
BNF-1 is consistent with that of other solid tumor oncogenes. While Hu merely indicates that
less differentially expressed genes are less often the subjects of scientific publications, Wu
asserts that the 2-fold to 4-fold overexpressed BNF-1 gene is consistent with other solid tumor
oncogenes. Thus, the teachings of Wu toward the utility of BNF-1, and similarly up-regulated
oncogenes in general, are more applicable to the question of the utility of Applicants' claimed
nucleic acids than the teachings of Hu. Accordingly, the evidence presented by the PTO, as a
whole, supports Applicants' assertion of utility of the claimed nucleic acids. 

As stated above, the standard for utility is not absolute certainty, but rather whether one 
of skill in the art would be more likely than not to believe the asserted utility. Hu and Wu are not 
sufficient to prove that a person of skill in the art would consider it unlikely that a gene 
differentially expressed in certain tumors can be used as a diagnostic tool since neither reference 
teaches against this, and Wu supports Applicants' asserted utility of the claimed nucleic acids. 
Given the lack of support for the PTO's position, and the supporting evidence provided by the 
PTO and Applicants for Applicants' position, one of skill in the art would be more likely than not 
to believe that the claimed nucleic acids related to the PRO1557 gene can be used as diagnostic tools 
for cancer, particularly esophageal and kidney cancer. 

In conclusion, Applicants submit that the evidence reported in Example 18, combined 
with the first Grimaldi Declaration, establishes that there is at least a two-fold difference in 
PRO1557 cDNA between esophageal tumor tissue and kidney tumor tissue, and normal 
esophageal tissue and normal kidney tissue, respectively. Therefore, it follows that expression 
levels of the PRO1557 gene can be used to distinguish esophageal tumor tissue from normal 
esophageal tissue and kidney tumor tissue from normal kidney tissue. The evidence offered by 
the PTO supports Applicants' asserted utility without supporting any significant argument to the 
contrary. Applicants have therefore established a utility for the claimed nucleic acids as 
diagnostic tools for cancer, particularly esophageal and kidney tumors. 

Applicants have established that the Accepted Understanding in the Art is that there is a Positive 
Correlation between mRNA Levels and the Level of Expression of the Encoded Protein 

Applicants have asserted that there is a direct correlation between changes in the level of 
mRNA and changes in the level of expression of the corresponding protein. Because the claims 
have been amended such that the claimed nucleic acids are not defined by the sequence of the 
polypeptide they encode, the question of whether there is a correlation between changes in gene 
expression and changes in protein expression is not presently at issue. However, Applicants 
submit that they have established for the record that it is well-established in the art that a change 
in the level of mRNA for a particular protein generally leads to a corresponding change in the 
level of the encoded protein. Given Applicants' evidence of differential expression of the 
mRNA for the PRO1557 polypeptide in esophageal and kidney tissue, it is more likely than not 
that the PRO1557 polypeptide is also differentially expressed. 

The PTO states that even if the claimed nucleic acid had utility, the "encoded polypeptide 
would have no such utility since there is no reason to suspect that there is alteration of 
polypeptide sequence or amount in esophageal or kidney tumor versus normal tissue." Office 
action at 4 (emphasis original). No substantiating evidence is presented. This statement in the 
Office Action does not satisfy the PTO's burden to offer evidence that one of ordinary skill in the 
art would reasonably doubt the truth of the asserted utility. As stated above, the standard for 
establishing a use for a claimed invention is not absolute or even statistical certainty, and thus a 
necessary correlation between mRNA levels and protein levels is not required. 

The PTO cites no evidence that would cast any doubt on the Applicants' assertion that in 
general, there is a positive correlation between changes in mRNA level and changes in the 
encoded protein level. 

In further support of the assertion that changes in mRNA are positively correlated to 
changes in protein levels, Applicants submit herewith a copy of a second Declaration by J. 
Christopher Grimaldi, an expert in the field of cancer biology (attached as Exhibit 5). This 
declaration was submitted in connection with the related co-pending and co-owned application 
Serial No. 10/063,557. As stated in paragraph 5 of the declaration, "Those who work in this field 
are well aware that in the vast majority of cases, when a gene is over-expressed... the gene 
product or polypeptide will also be over-expressed. ... This same principal applies to gene under- 
expression." Further, "the detection of increased mRNA expression is expected to result in 
increased polypeptide expression, and the detection of decreased mRNA expression is expected 
to result in decreased polypeptide expression. The detection of increased or decreased 
polypeptide expression can be used for cancer diagnosis and treatment." The references cited in 
the declaration and submitted herewith support this statement. 

Applicants also submit herewith a copy of the declaration of Paul Polakis, Ph.D. (attached 
as Exhibit 6), an expert in the field of cancer biology, originally submitted in a related and co- 
owned patent application Serial No. 10/032,996. As stated in paragraph 6 of his declaration: 

Based on my own experience accumulated in more than 20 years of research, 
including the data discussed in paragraphs 4 and 5 above [showing a positive 
correlation between mRNA levels and encoded protein levels in the vast majority 
of cases] and my knowledge of the relevant scientific literature, it is my 
considered scientific opinion that for human genes, an increased level of mRNA 
in a tumor cell relative to a normal cell typically correlates to a similar increase in 
abundance of the encoded protein in the tumor cell relative to the normal cell. In 
fact, it remains a central dogma in molecular biology that increased mRNA levels 
are predictive of corresponding increased levels of the encoded protein. 
(Emphasis added). 

Dr. Polakis acknowledges that there are published cases where such a correlation does not exist, 
but states that it is his opinion, based on over 20 years of scientific research, that "such reports 
are exceptions to the commonly understood general rule that increased mRNA levels are 
predictive of corresponding increased levels of the encoded protein." (Polakis Declaration, 
paragraph 6). 

The statements of Grimaldi and Polakis are supported by the teachings in Molecular 
Biology of the Cell, a leading textbook in the field (Bruce Alberts, et al., Molecular Biology of 
the Cell (3rd ed. 1994) (submitted herewith as Exhibit 7) and (4th ed. 2002) (submitted herewith 
as Exhibit 8)). Figure 9-2 of Exhibit 7 shows the steps at which eukaryotic gene expression can 
be controlled. The first step depicted is transcriptional control. Exhibit 7 provides that "[f]or 
most genes transcriptional controls are paramount. This makes sense because, of all the possible 
control points illustrated in Figure 9-2, only transcriptional control ensures that no superfluous 
intermediates are synthesized." Exhibit 7 at 403 (emphasis added). In addition, the text states 
that "Although controls on the initiation of gene transcription are the predominant form of 
regulation for most genes, other controls can act later in the pathway from RNA to protein to 
modulate the amount of gene product that is made." Exhibit 7 at 453 (emphasis added). Thus, as 
established in Exhibit 7, the predominant mechanism for regulating the amount of protein 
produced is by regulating transcription initiation. 

In Exhibit 8, Figure 6-3 on page 302 illustrates the basic principle that there is a 
correlation between increased gene expression and increased protein expression. The 
accompanying text states that "a cell can change (or regulate) the expression of each of its genes 
according to the needs of the moment - most obviously by controlling the production of its 
mRNA." Exhibit 8 at 302 (emphasis added). Similarly, Figure 6-90 on page 364 of Exhibit 8 
illustrates the path from gene to protein. The accompanying text states that while potentially 
each step can be regulated by the cell, "the initiation of transcription is the most common point 
for a cell to regulate the expression of each of its genes." Exhibit 8 at 364 (emphasis added). 
This point is repeated on page 379, where the authors state that of all the possible points for 
regulating protein expression, "[f]or most genes transcriptional controls are paramount." Exhibit 
8 at 379 (emphasis added). 

Further support for Applicants' position can be found in the textbook Genes VI 
(Benjamin Lewin, Genes VI (1997)) (submitted herewith as Exhibit 9), which states "having 
acknowledged that control of gene expression can occur at multiple stages, and that production of 
RNA cannot inevitably be equated with production of protein, it is clear that the overwhelming 
majority of regulatory events occur at the initiation of transcription." Genes VI at 847-848 
(emphasis added). 

Additional support is also found in Zhigang et al., World Journal of Surgical Oncology 
2:13, 2004, submitted herewith as Exhibit 10. Zhigang studied the expression of prostate stem 
cell antigen (PSCA) protein and mRNA to validate it as a potential molecular target for diagnosis 
and treatment of human prostate cancer. The data showed "a high degree of correlation between 
PSCA protein and mRNA expression." Exhibit 10 at 4. Of the samples tested, 81 out of 87 
showed a high degree of correlation between mRNA expression and protein expression. The 
authors conclude that "it is demonstrated that PSCA protein and mRNA overexpressed in human 
prostate cancer, and that the increased protein level of PSCA was resulted from the upregulated 
transcription of its mRNA." Exhibit 10 at 6. Even though the correlation between mRNA 
expression and protein expression occurred in 93% of the samples tested, not 100%, the authors 
state that "PSCA may be a promising molecular marker for the clinical prognosis of human PCa 
and a valuable target for diagnosis and therapy of this tumor." Exhibit 10 at 7. 

Further, Meric et al., Molecular Cancer Therapeutics, vol. 1, 971-979 (2002), submitted 
herewith as Exhibit 11, states the following: 

The fundamental principle of molecular therapeutics in cancer is to exploit the 
differences in gene expression between cancer cells and normal cells... [M]ost 
efforts have concentrated on identifying differences in gene expression at the level 
of mRNA, which can be attributable to either DNA amplification or to differences 
in transcription. Meric et al. at 971 (emphasis added). 

Those of skill in the art would not be focusing on differences in gene expression between cancer 
cells and normal cells if there were no correlation between gene expression and protein 
expression. 

As discussed above, whether or not increased levels of PRO1557 mRNA correlate with 
increased levels of PRO1557 protein is not presently at issue. However, Applicants submit that, 
together, the declarations of Grimaldi and Polakis, the accompanying references, and the excerpts 
and references provided above all establish that the accepted understanding in the art is that there 
is a reasonable correlation between changes in gene expression and the level of the encoded 
protein. In light of the lack of support for any argument by the PTO to the contrary, Applicants 
submit that they have established that it is more likely than not that one of skill in the art would 
believe that because the PRO1557 mRNA is expressed at a higher level in esophageal tumor and 
kidney tumor compared to normal esophageal and normal kidney tissue, respectively, the 
PRO1557 polypeptide will also be expressed at a higher level in esophageal tumor and kidney 
tumor compared to normal esophageal and normal kidney tissue, respectively. 

The Claimed Nucleic Acids would have Diagnostic Utility even if there is no Direct Correlation 
between Gene Expression and Protein Expression 

Even assuming arguendo that there is no direct correlation between changes in gene 
expression and changes in protein expression for PRO1557, which Applicants submit is not true, 
nucleic acids related to a gene that is differentially expressed in cancer would still have a 
credible, specific and substantial utility. 

In paragraph 6 of the Grimaldi Declaration, Exhibit 5, Mr. Grimaldi explains that: 

However, even in the rare case where the protein expression does not correlate 
with the mRNA expression, this still provides significant information useful for 
cancer diagnosis and treatment. For example, if over- or under-expression of a 
gene product does not correlate with over- or under-expression of mRNA in 
certain tumor types but does so in others, then identification of both gene 
expression and protein expression enables more accurate tumor classification and 
hence better determination of suitable therapy. 

This conclusion is echoed in the Declaration of Avi Ashkenazi, Ph.D. (attached as 
Exhibit 12), an expert in the field of cancer biology. This declaration was previously submitted 
in connection with co-pending application Serial No. 09/903,925. Applicants submit that 
simultaneous testing of gene expression and gene product expression enables more accurate 
tumor classification, even if there is no positive correlation between the two. This leads to better 
determination of a suitable therapy. 

This is further supported by the teachings in the article by Hanna and Momin (attached as 
Exhibit 13). The article teaches that the HER-2/neu gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in 40-60% of intraductal breast 
carcinoma. Further, the article teaches that diagnosis of breast cancer includes testing both the 
amplification of the HER-2/neu gene (by FISH) as well as the overexpression of the HER-2/neu 
gene product (by IHC). Even when the protein is not overexpressed, the assay relying on both 
tests leads to a more accurate classification of the cancer and a more effective treatment of it. 

The Applicants have established that it is the general, accepted understanding in the art 
that there is a positive correlation between changes in gene expression and changes in protein 
expression. However, even when this is not the case, a gene that is differentially expressed in 
cancer would still have utility. Thus, Applicants have demonstrated another basis for supporting 
the asserted utility for the claimed nucleic acids. 

The Arguments made by the PTO are Not Sufficient to satisfy the PTO's Initial Burden of 
Offering Evidence "that one of ordinary skill in the art would reasonably doubt the asserted 
utility" 

As stated above, an Applicant's assertion of utility creates a presumption of utility that 
will be sufficient to satisfy the utility requirement of 35 U.S.C. § 101, "unless there is a reason 
for one skilled in the art to question the objective truth of the statement of utility or its scope." In 
re Langer, 503 F.2d 1380, 1391, 183 USPQ 288, 297 (CCPA 1974). The evidentiary standard to 
be used throughout ex parte examination in setting forth a rejection is a preponderance of the 
evidence, or "more likely than not" standard. In re Oetiker, 977 F.2d 1443, 1445, 24 USPQ2d 
1443, 1444 (Fed. Cir. 1992). This is stated explicitly in the M.P.E.P.: 

[T]he applicant does not have to provide evidence sufficient to establish that an 
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a 
whole, it leads a person of ordinary skill in the art to conclude that the asserted 
utility is more likely than not true. M.P.E.P. at § 2107.02, part VII (2004) 
(underline emphasis in original, bold emphasis added, internal citations omitted). 

The PTO has the initial burden to offer evidence "that one of ordinary skill in the art 
would reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 
1436 (Fed. Cir. 1995). Only then does the burden shift to the Applicant to provide rebuttal 
evidence. Id. As stated in the M.P.E.P., such rebuttal evidence does not need to absolutely prove 
that the asserted utility is real. Rather, the evidence only needs to be reasonably indicative of the 
asserted utility. 

The PTO has not offered any arguments or cited any references to establish "that one of 
ordinary skill in the art would reasonably doubt" that a gene differentially expressed in certain 
tumors can be used as a diagnostic tool. Given the lack of support for the PTO's position, 
Applicants submit that the PTO has not met its initial burden of overcoming the presumption that 
the asserted utility is sufficient to satisfy the utility requirement. And even if the PTO has met 
that burden, the Applicants' supporting rebuttal evidence is sufficient to establish that one of skill 
in the art would be more likely than not to believe that the claimed nucleic acids can be used as 
diagnostic tools for cancer, particularly esophageal and kidney cancer. 

Specific Utility 

The Asserted Substantial Utilities are Specific to the Claimed Nucleic Acids 

Applicants next address the PTO's assertion that the asserted utilities are not specific to 
the claimed nucleic acids related to PR01557. Applicants respectfully disagree. 

Specific Utility is defined as utility which is "specific to the subject matter claimed," in 
contrast to "a general utility that would be applicable to the broad class of the invention." 
M.P.E.P. § 2107.01 I. Applicants submit that the evidence of differential expression of the 
PRO1557 gene in certain types of cancer cells, along with the declarations and references 
discussed above, provide a specific utility for the claimed nucleic acids. 

As discussed above, there are significant data which show that the gene encoding the 
PRO1557 polypeptide is more highly expressed in esophageal tumor tissue and kidney tumor 
tissue compared to normal esophageal tissue and normal kidney tissue, respectively. These data 
are strong evidence that the gene encoding the PRO1557 polypeptide is associated with 
esophageal and kidney tumors. Thus, contrary to the assertions of the PTO, Applicants submit 
that they have provided evidence associating the gene encoding PRO1557 with two specific 
diseases. The asserted utility as a diagnostic tool for cancer, particularly esophageal tumor and 
kidney tumor, is a specific utility - it is not a general utility that would apply to the broad class of 
nucleic acids. 

Conclusion 

The PTO has asserted three arguments for why there is a lack of a substantial utility: (1) 
the data reporting that the PRO1557 gene is differentially expressed in certain tumors is not 
sufficient because there is not sufficient information regarding expression levels, and because the 
information is too sparse to allow the encoding polynucleotide to be used; (2) that the literature 
cautions researchers from drawing conclusions based on small changes in transcript expression 
levels between normal and cancerous tissue; and, (3) that because there is no necessary 
correlation between gene amplification and protein expression, the claimed nucleic acids cannot 
be used as cancer diagnostic or therapeutic tools. Applicants have addressed each of these 
arguments in turn. 

First, the Applicants provide a declaration stating that the data in Example 18 reporting 
higher expression of the PRO1557 gene in esophageal tumor tissue and kidney tumor tissue 
compared to normal esophageal tissue and normal kidney tissue, respectively, are real and 
significant. This declaration also indicates that given the at least two-fold difference in 
expression levels, the claimed nucleic acids have utility as cancer diagnostic tools. Applicants 
have also shown that the precise level of expression and activity or role of the PRO1557 
polypeptide or the gene that encodes it in cancer is irrelevant to the utility of the claimed subject 
matter. Resolution of these issues is not required to use the claimed nucleic acids as tumor 
diagnostic tools - one does not have to know why the PRO1557 gene is differentially expressed 
in certain tumors to use it as a tumor marker. 

Second, Applicants have shown that the Hu and Wu references cited by the PTO do not 
teach that genes differentially expressed in cancer cannot be used as diagnostic tools. In fact, Wu 
supports Applicants' asserted utility and refutes the PTO's argument based on the Hu 
publication. 

Third, Applicants assert that whether the encoded polypeptide is also differentially 
expressed in certain tumors is currently not at issue in this application. However, Applicants 
believe that they have established that there is a reasonable correlation between changes in gene 
expression and corresponding changes in the level of the encoded protein. The PTO provides no 
evidence to the contrary. Applicants have presented the declarations of two experts in the field 
along with supporting references which establish that the general, accepted view of those of skill 
in the art is that there is a direct correlation between changes in mRNA levels and the encoded 
protein levels. 

Applicants have also presented the declarations of two experts in the field, along with 
supporting references, which establish that even in the anomalous case where there is no positive 
correlation between gene expression and expression of the encoded protein, the simultaneous 
monitoring of both is useful for diagnosis and further classification of the cancer. 

Finally, the PTO asserts that there is no asserted specific utility. Applicants have pointed 
out that the substantial utilities described above are specific to the claimed nucleic acids because 
the gene encoding PRO1557 is differentially expressed in certain cancer cells compared to the 
corresponding normal cells. This is not a general utility that would apply to the broad class of 
nucleic acids. 

Thus, given the totality of the evidence provided, Applicants submit that they have 
established a substantial, specific, and credible utility for the claimed nucleic acids as a 
diagnostic agent. According to the PTO Utility Examination Guidelines (2001), irrefutable proof 
of a claimed utility is not required. Rather, a specific, substantial, and credible utility requires 
only a "reasonable" confirmation of a real world context of use. Applicants remind the PTO that: 

A small degree of utility is sufficient . . . The claimed invention must only be 
capable of performing some beneficial function ... An invention does not lack 
utility merely because the particular embodiment disclosed in the patent lacks 
perfection or performs crudely ... A commercially successful product is not 
required . . . Nor is it essential that the invention accomplish all its intended 
functions ... or operate under all conditions . . . partial success being sufficient to 
demonstrate patentable utility ... In short, the defense of non-utility cannot be 
sustained without proof of total incapacity. If an invention is only partially 
successful in achieving a useful result, a rejection of the claimed invention as a 
whole based on a lack of utility is not appropriate. M.P.E.P. at 2107.01 (underline 
emphasis in original, bold emphasis added, citations omitted). 

Applicants submit that they have established that it is more likely than not that one of 
skill in the art would reasonably accept the utility for the claimed nucleic acids relating to 
PRO1557 set forth in the specification. In view of the above, Applicants respectfully request that 
the PTO reconsider and withdraw the utility rejection under 35 U.S.C. §101. 

Rejections under 35 U.S.C. § 112, first paragraph - Enablement 

The PTO rejected Claims 1-20 under 35 U.S.C. § 112, first paragraph, as containing 
subject matter which was not described in the specification in such a way as to enable one skilled 
in the art to use the invention. The PTO argues that because the claimed invention is not 
supported by a substantial, specific and credible utility, the claims are not enabled. The PTO 
states that the specification does not provide sufficient guidance or working examples to be able 
to use the claimed nucleic acids diagnostically or therapeutically, without undue experimentation. 

Applicants respectfully traverse. 

As an initial matter, Applicants submit that in the discussion of the 35 U.S.C. § 101 
rejection above, Applicants have established a substantial, specific, and credible utility for the 
claimed nucleic acids. Applicants therefore request that the PTO reconsider and withdraw the 
enablement rejection to the extent that it is based on a lack of utility for the claimed nucleic 
acids. 

As amended, the pending claims are directed to nucleic acids that have at least 95% or 99% 
nucleic acid sequence identity to the recited sequence and are "more highly expressed in 
esophageal tumor and kidney tumor tissue compared to normal esophageal and normal kidney 
tissue, respectively." Other claimed nucleic acids can hybridize to the recited sequences under 
stringent conditions. 

Applicants submit that the claimed nucleic acids are enabled, as one of skill in the art 
would know how to make and use them. It is well-established in the art how to make the claimed 
nucleic acids which have at least 95% or 99% sequence identity to the disclosed sequences 
related to SEQ ID NO: 81. Likewise, Applicants have disclosed how to determine if the recited 
nucleic acids are differentially expressed in esophageal and kidney tumors compared to their 
normal counterparts (see, e.g., Example 18 beginning at paragraph [0529] of the specification). 
Finally, it is well-known in the art how to determine if the recited nucleic acids hybridize to the 
disclosed sequences under the specified stringent conditions. Thus, one of skill in the art would 
know how to make the claimed nucleic acids. 

As discussed above, Applicants submit that they have established that one of skill in the 
art would believe that it is more likely than not that the PRO1557 gene is differentially expressed 
in esophageal and kidney tumors. Given the disclosure in the specification and the level of skill 
in the art, a skilled artisan would know how to use the claimed nucleic acids as diagnostic tools. 
For example, nucleic acids which have at least 95% or 99% sequence identity to the disclosed 
sequences and are "more highly expressed in esophageal tumor and kidney tumor tissue 
compared to normal esophageal and normal kidney tissue, respectively" can be used as 
diagnostic tools since the claimed nucleic acids are themselves differentially expressed in certain 
tumors. A nucleic acid which has at least 95% or 99% sequence identity to the disclosed 
sequences and hybridizes to the disclosed sequences under the specified stringent conditions can 
be used as a hybridization probe to detect the expression of the PRO1557 gene, making it useful 
as a diagnostic tool. Given the skill in the art and the disclosure of how to make and use the 
claimed nucleic acids, Applicants request that the PTO reconsider and withdraw its rejection 
under 35 U.S.C. § 112, first paragraph. 

Rejection under 35 U.S.C. § 112, first paragraph - Written Description 

The PTO has rejected Claims 1-6, 9, 10 and 14-20 under 35 U.S.C. §112, first paragraph, 
as containing subject matter which was not described in the specification in such a way as to 
reasonably convey to one skilled in the art that the inventors, at the time the application was 
filed, had possession of the invention. According to the PTO, because the claims do not require 
that the claimed nucleic acids or encoded polypeptides possess any particular biological activity, 
particular conserved structure, or other disclosed distinguishing feature, the claims fail the 
written description requirement. The PTO states that the claims are drawn to a genus of nucleic 
acids that is defined only by sequence identity. Finally, the PTO states that the only factor 
present in the claim is a partial structure in the form of a recitation of percent identity. The PTO 
concludes that in the absence of sufficient recitation of distinguishing identifying characteristics, 
the specification does not provide adequate written description of the claimed genus. 

The Legal Standard for Written Description 

The well-established test for sufficiency of support under the written description 
requirement of 35 U.S.C. § 112, first paragraph, is whether the disclosure "reasonably conveys to 
the artisan that the inventor had possession at that time of the later claimed subject matter." In re 
Kaslow, 707 F.2d 1366, 1375, 217 USPQ 1089, 1096 (Fed. Cir. 1983); see also Vas-Cath, Inc. 
v. Mahurkar, 935 F.2d at 1563, 19 USPQ2d at 1116 (Fed. Cir. 1991). The adequacy of written 
description support is a factual issue and is to be determined on a case-by-case basis. See, e.g., 
Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 USPQ2d at 1116 (Fed. Cir. 1991). The factual 
determination in a written description analysis depends on the nature of the invention and the 
amount of knowledge imparted to those skilled in the art by the disclosure. Union Oil v. Atlantic 
Richfield Co., 208 F.3d 989, 996 (Fed. Cir. 2000). 

The Current Invention is Adequately Described 

As noted above, whether the Applicants were in possession of the invention as of the 
effective filing date of an application is a factual determination, reached by the consideration of a 
number of factors, including the level of knowledge and skill in the art, and the teaching 
provided by the specification. The inventor is not required to describe every single detail of 
his/her invention. An Applicant's disclosure obligation varies according to the art to which the 
invention pertains. The present invention pertains to the field of recombinant DNA/protein 
technology. It is well-established that the level of skill in this field is very high since a 
representative person of skill is generally a Ph.D. scientist with several years of experience. 
Accordingly, the teaching imparted in the specification must be evaluated through the eyes of a 
highly skilled artisan as of the date the invention was made. 

The subject matter of the pending claims concerns nucleic acids having 95% or 99% 
sequence identity to the nucleic acid sequence of SEQ ID NO:81, the full-length coding sequence 
of the nucleic acid sequence of SEQ ID NO:81, or the full-length coding sequence of the cDNA 
deposited under ATCC accession number 203317, with the functional recitation as amended: 
"wherein said nucleic acid is more highly expressed in esophageal tumor and kidney tumor tissue 
compared to normal esophageal and normal kidney tissue, respectively." Other claimed nucleic 
acids hybridize to the nucleic acid sequence of SEQ ID NO:81, the full-length coding sequence 
of the nucleic acid sequence of SEQ ID NO:81, the full-length coding sequence of the cDNA 
deposited under ATCC accession number 203317, or the complements thereof, under the 
specified stringent conditions. We turn first to the claims which recite specific high stringency 
hybridization conditions. 

In Enzo Biochem v. Gen-Probe Inc., 323 F.3d 956 (Fed. Cir. 2002), the Court held that 
functional descriptions of genetic material may satisfy the written description requirement. In so 
holding, the Court gave judicial notice to the USPTO's Manual of Patent Examining Procedure, 
which provides that the written description requirement may be satisfied when the disclosure 
provides sufficiently detailed identifying characteristics, such as "complete or partial structure, 
other physical and/or chemical properties, functional characteristics when coupled with a known 
or disclosed correlation between function and structure, or some combination of such 
characteristics." Id. at 964, quoting 66 Fed. Reg. at 1106 (emphasis in original). In Enzo, the 
Court found that describing nucleic acids based on their ability to hybridize to another nucleic acid 
sequence which was adequately described may be an adequate description of the nucleic acid. 
This is because the hybridization function of a nucleic acid is dependent on the sequences of the 
nucleic acid - a disclosed function which is coupled with a known correlation between function 
and structure. The Court favorably discussed the PTO's example wherein "genus claims to 
nucleic acids based on their hybridization properties... may be adequately described if they 
hybridize under highly stringent conditions to known sequences because such conditions dictate 
that all species within the genus will be structurally similar." Id. at 967 (citing Application of 
[Written Description] Guidelines, Example 9) (emphasis added). 

Applicants submit that the stringent hybridization conditions specified in the pending 
claims, alone or in combination with the recited percent sequence identity, result in all species 
within the genus being structurally similar. As the Enzo Court noted, Examples 9 and 10 of the 
Application of Written Description Guidelines (hereinafter "Guidelines") make clear that 
specifying hybridization under highly stringent conditions yields "structurally similar DNAs." 
Guidelines, Example 9 at page 36. The analysis of a genus claim in Example 10 of the 
Guidelines states: 

[T]urning to the genus analysis, the art indicates that there is no substantial 
variation within the [claimed] genus because of the stringency of hybridization 
conditions which yields structurally similar molecules. The single disclosed 
species is representative of the genus because reduction to practice of this species, 
considered along with the defined hybridization conditions and the level of skill 
and knowledge in the art, are sufficient to allow the skilled artisan to recognize 
that applicant was in possession of the necessary common attributes or features of 
the elements possessed by the members of the genus. Guidelines, Example 10 at 
page 39 (emphasis added). 

Given the level of skill in the art, specifying highly stringent conditions leads to "no 
substantial variation within the [claimed] genus," and therefore a skilled artisan would recognize 
that the Applicants were in possession of the necessary common attributes or features of the 
genus. This is contrary to the PTO's argument that the claimed sequences do not possess "any 
particular conserved structure, or other disclosed distinguishing feature." Office Action at 11. 
The common element or attribute of the claimed genus of nucleic acids is that species of the 
genus contain a nucleic acid which is structurally related to SEQ ID NO: 81, such that the nucleic 
acids hybridize to SEQ ID NO: 81 or the related sequences under the specified high stringency 
conditions recited in the claims. 

The present situation is not analogous to Fiddes v. Baird, 30 U.S.P.Q. 2d 1481, cited by 
the PTO. Unlike Fiddes, where arguably the structure of other mammalian sequences could not 
be conceived based on a single species of the genus, here the skill in the art is such that the 
sequence of nucleic acids which hybridize to SEQ ID NO: 81 under the conditions specified can 
be conceived. Here, the claimed genus is defined by its structure - members of the genus 
hybridize under the specified conditions to the specified sequences, each of which is adequately 
described in the specification. 

Applicants submit that the pending claims relating to nucleic acids having 95% or 99% 
sequence identity to the nucleic acids related to SEQ ID NO:81 with the functional recitation 
"wherein said nucleic acid is more highly expressed in esophageal tumor and kidney tumor tissue 
compared to normal esophageal and normal kidney tissue, respectively" are also adequately 
described. In Example 14 of the written description training materials, the written description 
requirement was found to be satisfied for claims relating to polypeptides having 95% homology 
to a particular sequence and possessing a particular catalytic activity, even though the applicant 
had not made any variants. Similarly, the pending claims also have very high sequence 
homology to the disclosed sequences and must share the same expression pattern in certain 
tumors. In Example 14, the procedures for making variants were known in the art and the 
disclosure taught how to test for the claimed catalytic activity. Similarly, in the instant 
application, it is well known in the art how to make nucleic acids which have at least 95% 
sequence identity to the disclosed sequences, and the specification discloses how to test to 
determine if the nucleic acid sequence is differentially expressed in esophageal or kidney tumors. 
Like Example 14, the genus of nucleic acids that have at least 95% or 99% sequence identity to 
the disclosed sequences will not have substantial variation since all of the variants must have the 
same expression in certain tumors. 
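
As a rough illustration of the percent-identity threshold discussed above, the following Python sketch computes identity over two sequences that are assumed to be already aligned and of equal length. It is a simplification under stated assumptions: the names `percent_identity` and `meets_threshold`, the toy sequences, and the treatment of gaps as mismatches are hypothetical, and this is not the alignment method defined in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same nucleotide.

    Assumes the two sequences are pre-aligned (equal length); a gap
    character '-' in either sequence is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True when identity is at least the threshold (e.g. 95 or 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-nucleotide reference and a variant with one substitution.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGTACGTACGA"
identity = percent_identity(reference, variant)
```

In this toy case one mismatch in twenty positions sits exactly at a 95% threshold; a 99% threshold would exclude the same variant.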

Furthermore, while Applicants appreciate that actions taken by the PTO in other 
applications are not binding with respect to the examination of the present application, 
Applicants note that the PTO has issued many patents containing claims to variant nucleic acids 
or variant proteins where the applicants did not actually make such nucleic acids or proteins. 
Representative patents include U.S. Patent No. 6,737,522, U.S. Patent No. 6,395,306, U.S. Patent 
No. 6,025,156, U.S. Patent No. 6,645,499, U.S. Patent No. 6,498,235, and U.S. Patent No. 
6,730,502, which are submitted herewith as Exhibits 14-19. 

In conclusion, Applicants submit that they have satisfied the written description 
requirement for the pending claims based on the actual reduction to practice of SEQ ID NO: 81, 
by specifying the high stringency conditions under which hybridization occurs, and by describing 
the gene expression assay, all of which result in a lack of substantial variability in the species 
falling within the scope of the instant claims. Applicants submit that this disclosure would allow 
one of skill in the art to "recognize that the applicant was in possession of the necessary common 
attributes or features of the elements possessed by the members of the genus." Hence, 
Applicants respectfully request that the PTO reconsider and withdraw the written description 
rejection under 35 U.S.C. §112. 

Rejections under 35 U.S.C. § 112, second paragraph - Indefiniteness 

The PTO has rejected Claim 15 under 35 U.S.C. § 112, second paragraph, as being 
indefinite. The PTO objects to the use of "stringent conditions." Claim 15 has been canceled. 

The PTO has also rejected Claims 1-6, 9, 10, and 14, and dependent claims 7, 8, 11-13, 
16 and 17-20, under 35 U.S.C. § 112, second paragraph, as being indefinite. The PTO objects to 
the recitation of "the extracellular domain" allegedly because no extracellular domain is 
identified. 

Applicants have amended the claims to delete any reference to an extracellular domain. 
Claim 14 is further rejected as indefinite because the conditions for hybridization of the claimed 
nucleic acid are allegedly unclear. Claim 14 is amended to recite stringent conditions for 
hybridization of the claimed nucleic acid. In light of these amendments, Applicants request that 
the PTO withdraw the indefiniteness rejections under 35 U.S.C. § 112, second paragraph. 

Rejection under 35 U.S.C. § 102(b) - Anticipation 

The PTO has rejected Claims 14-16 as anticipated under 35 U.S.C. §102(b) by GenBank 
Accession AA040433. According to the PTO, AA040433 discloses a nucleic acid that is 97% 
identical to SEQ ID NO: 81 over 381 consecutive bases. 

While Applicants do not acquiesce to the PTO's position that the nucleotide sequence of 
AA040433 is encompassed by the polynucleotide of Claim 14, Applicants have canceled Claim 
15, and amended Claims 14 and 16 such that the recited nucleic acid must be at least about 450 
nucleotides, or at least about 500 nucleotides in length, respectively. The polynucleotide 
disclosed in AA040433 does not anticipate amended Claims 14 or 16. Accordingly, Applicants 
request that the PTO reconsider and withdraw the rejection of Claims 14 and 16 under 35 U.S.C. 
§ 102(b). 

The PTO has rejected Claims 1-10 and 12-20 as anticipated under 35 U.S.C. § 102(b) by 
WO 00/70049. The PTO states that WO 00/70049 teaches a nucleotide sequence that is 1720 
nucleotides in length and 100% identical to nucleotides 1-1720 of Applicants' SEQ ID NO:81. 
As discussed above, the instant claimed subject matter has utility based upon the data in Example 
18 and the instant application is a continuation of PCT/US00/23328; therefore, the present claims 
are entitled to the filing date of August 24, 2000. WO 00/70049 is not prior art under § 102(b). 

WO 00/70049 was published on November 23, 2000, which is subsequent to the filing of 
priority application PCT/US00/23328 (August 24, 2000). Again, PCT/US00/23328 discloses the 
differential expression data which provides utility for the instant claims, and Applicants are 
entitled to the filing date of August 24, 2000. Therefore, WO 00/70049 cannot be cited under 
§ 102(b). 
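
The timing argument above reduces to simple date arithmetic. The following is a minimal sketch, assuming the pre-AIA § 102(b) test under which a printed publication bars a claim only if it was published more than one year before the effective U.S. filing date. The function name `is_102b_reference` is hypothetical; the two dates are those recited above.

```python
from datetime import date


def is_102b_reference(publication: date, effective_filing: date) -> bool:
    """True if the publication predates the filing date by more than one year."""
    try:
        critical = effective_filing.replace(year=effective_filing.year - 1)
    except ValueError:
        # Filing date of Feb 29 with no Feb 29 in the prior year.
        critical = effective_filing.replace(year=effective_filing.year - 1, day=28)
    return publication < critical


wo_publication = date(2000, 11, 23)   # WO 00/70049 publication date
priority_filing = date(2000, 8, 24)   # PCT/US00/23328 filing date

bars_claims = is_102b_reference(wo_publication, priority_filing)
```

Because the publication date falls after the claimed filing date, the check returns false for these inputs, which matches the position taken in the text.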

In view of the above discussion, reconsideration and withdrawal of the rejection under 
§ 102(b) is respectfully requested. 

CONCLUSION 

In view of the above, Applicants respectfully maintain that the claims are patentable and 
request that they be passed to issue. Applicants invite the Examiner to call the undersigned if any 
remaining issues may be resolved by telephone. 

Please charge any additional fees, including any fees for additional extension of time, or 
credit overpayment to Deposit Account No. 11-1410. 



Respectfully submitted, 



KNOBBE, MARTENS, OLSON & BEAR, LLP 




Attorney of Record 
Customer No. 30,313 
(619) 235-8550 



DELETION OF INVENTORS 

Please correct the inventorship under 37 CFR § 1.48(b) by removing the following 
inventors from the present application: 

Dan L. Eaton, Ellen Filvaroff, Mary E. Gerritsen, and Colin K. Watanabe. 
Applicants request that these inventors be deleted, as their inventions are no longer being claimed 
in the present application as a result of prosecution. 
DECLARATION OF J. CHRISTOPHER GRIMALDI UNDER 37 CFR § 1.132 

Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Dear Sir: 

I, J. Christopher Grimaldi, declare and state as follows: 

1. I am a Senior Research Associate in the Molecular Biology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 

3. I joined Genentech in January of 1999. From 1999 to 2003, I directed the Cloning 
Laboratory in the Molecular Biology Department. During this time I directed or performed 
numerous molecular biology techniques including semi-quantitative Polymerase Chain Reaction 
(PCR) analyses. I am currently involved, among other projects, in the isolation of genes coding 
for membrane associated proteins which can be used as targets for antibody therapeutics against 
cancer. In connection with the above-identified patent application, I personally performed or 
directed the semi-quantitative PCR gene expression analyses in the assay entitled "Tumor Versus 
Normal Differential Tissue Expression Distribution" which is described in EXAMPLE 18 in the 
specification. These studies were used to identify differences in gene expression between tumor 
tissue and their normal counterparts. 

4. EXAMPLE 18 reports the results of the PCR analyses conducted as part of the 
investigation of several newly discovered DNA sequences. This process included developing 
primers and analyzing expression of the DNA sequences of interest in normal and tumor tissues. 
The analyses were designed to determine whether a difference exists between gene expression in 
normal tissues as compared to tumor in the same tissue type. 

5. The DNA libraries used in the gene expression studies were made from pooled 
samples of normal and of tumor tissues. Data from pooled samples is more likely to be accurate 
than data obtained from a sample from a single individual. That is, the detection of variations in 
gene expression is likely to represent a more generally relevant condition when pooled samples 
from normal tissues are compared with pooled samples from tumors in the same tissue type. 

6. In differential gene expression studies, one looks for genes whose expression levels 
differ significantly under different conditions, for example, in normal versus diseased tissue. 
Thus, I conducted a semi-quantitative analysis of the expression of the DNA sequences of 
interest in normal versus tumor tissues. Expression levels were graded according to a scale of +, -, 
and +/- to indicate the amount of the specific signal detected. Using the widely accepted 
technique of PCR, it was determined whether the polynucleotides tested were more highly 
expressed, less expressed, or whether expression remained the same in tumor tissue as compared 
to its normal counterpart. Because this technique relies on the visual detection of ethidium 
bromide staining of PCR products on agarose gels, it is reasonable to assume that any detectable 
differences seen between two samples will represent at least a two fold difference in cDNA. 

7. The results of the gene expression studies indicate that the genes of interest can be 
used to differentiate tumor from normal. The precise levels of gene expression are irrelevant; 
what matters is that there is a relative difference in expression between normal tissue and tumor 
tissue. The precise type of tumor is also irrelevant; again, the assay was designed to indicate 
whether a difference exists between normal tissue and tumor tissue of the same type. If a 
difference is detected, this indicates that the gene and its corresponding polypeptide and 
antibodies against the polypeptide are useful for diagnostic purposes, to screen samples to 
differentiate between normal and tumor. Additional studies can then be conducted if further 
information is desired. 



8. I hereby declare that all statements made herein of my own knowledge are true and 
that all statements made on information or belief are believed to be true, and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that such willful statements may jeopardize the validity of the application or any 
patent issued thereon. 

J. Christopher Grimaldi 

EXHIBIT A 



J. Christopher Grimaldi 

1434-36th Ave. 

San Francisco, CA 94122 

(415) (»8M639 (Home) 

EDUCATION University of California, Berkeley 

Bachelor of Arts in Molecular Biology, 1984 



EMPLOYMENT EXPERIENCE 



SRA Genentech Inc., South San Francisco; 1/99 to present 

Previously, was responsible to direct and manage the Cloning Lab. Currently focused on 
isolating cancer specific genes for the Tumor Antigen (TAP), and Secreted Tumor Protein 
(STOP) projects for the Oncology Department as well as immunologically relevant genes for the 
Immunology Department. Directed a lab of 6 scientists focused on a company-wide team effort 
to identify and isolate secreted proteins for potential therapeutic use (SPDI). For the SPDI project 
my duties were, among other things, the critically important coordination of the cloning of 
thousands of putative genes, by developing a smooth process of communication between the 
Bioinformatics, Cloning, Sequencing, and Legal teams. Collaborated with several groups to 
discover novel genes through the Curagen project, a unique differential display methodology. 
Interacted extensively with the Legal team providing essential data needed for filing patents on 
novel genes discovered through the SPDI, TAP and Curagen projects. My group has developed, 
implemented and patented high throughput cloning methodologies that have proven to be 
essential for the isolation of hundreds of novel genes for the SPDI, TAP and Curagen projects as 
well as dozens of other smaller projects. 



Scientist DNAX Research Institute, Palo Alto; 9/91 to 1/99 

Involved in multiple projects aimed at understanding novel genes discovered through 
bioinformatics studies and functional assays. Developed and patented a method for the specific 
depletion of eosinophils in vivo using monoclonal antibodies. Developed and implemented 
essential technical methodologies and provided strategic direction in the areas of expression, 
cloning, protein purification, general molecular biology, and monoclonal antibody production. 
Trained and supervised numerous technical staff. 

Facilities 

Manager Corixa, Redwood City; 5/89 -7/91. 

Directed plant-related activities, which included expansion planning, maintenance, safety, 
purchasing, inventory control, shipping and receiving, and laboratory management. Designed 
and implemented the safety program. Also served as liaison to regulatory agencies at the local, 
state and federal level. Was in charge of property leases, leasehold improvements, etc. 
Negotiated vendor contracts and directed the purchasing department. Trained and supervised 
personnel to carry out the above-mentioned duties. 



SRA University of California, San Francisco 

Cancer Research Institute; 2/87-4/89. 



Was responsible for numerous cloning projects including: studies of somatic hypermutation, 
studies of AIDS-associated lymphomas, and cloning of t(5;14), t(11;14), and t(8;14) 
translocations. Focused on the activation of hemopoietic growth factors involved in the t(5;14) 
translocation in leukemia patients. 



Research 

Technician Berlex Biosciences, South San Francisco; 7/85-2/87. 

Worked on a subunit porcine vaccine directed against Mycoplasma hyopneumoniae. Was 
responsible for generating genomic libraries, screening with degenerate oligonucleotides, and 
characterizing and expressing clones in E. coli. Also constructed a general purpose expression 
vector for use by other scientific teams. 

PUBLICATIONS 

1. Hilary F. Clark, et al. "The Secreted Protein Discovery Initiative (SPDI), a Large-scale 
Effort to Identify Novel Human Secreted and Transmembrane Proteins: a bioinformatics 
assessment." Genome Res. Vol 13(10), 2265-2270, 2003 

2. Sean H. Adams, Clarissa Chui, Sarah L. Schilbach, Xing Xian Yu, Audrey D. Goddard, J. 
Christopher Grimaldi, James Lee, Patrick Dowd, David A. Lewin, & Steven Colman. "BFIT, 
a Unique Acyl-CoA Thioesterase Induced in Thermogenic Brown Adipose Tissue: Cloning, 
organization of the human gene and assessment of a potential link to obesity." Biochemical 
Journal, Vol 360, 135-142, 2001 

3. Szeto W, Jiang W, Tice DA, Rubinfeld B, Hollingshead PG, Fong SE, Dugger DL, Pham T, 
Yansura D, Wong TA, Grimaldi JC, Corpuz RT, Singh JS, Frantz GD, Devaux B, Crowley 
CW, Schwall RH, Eberhard DA, Rastelli L, Polakis P, and Pennica D. "Overexpression of 
the Retinoic Acid-Responsive Gene Stra6 in Human Cancers and its Synergistic Activation 
by Wnt-1 and Retinoic Acid." Cancer Research Vol. 61(10), 4197-4205, 2001 

4. Jeanne Kahn, Fuad Mehraban, Gladdys Ingle, Xiaohua Xin, Juliet E. Bryant, Gordon Vehar, 
Jill Schoenfeld, J. Christopher Grimaldi (incorrectly named as "Grimaldi, CP"), Franklin 
Peale, Aparna Draksharapu, David A. Lewin, and Mary E. Gerritsen. "Gene Expression 
Profiling in an in Vitro Model of Angiogenesis." American Journal of Pathology Vol 156(6), 
1887-1900, 2000. 

5. Grimaldi JC, Yu NX, Grunig G, Seymour BW, Cottrez F, Robinson DS, Hosken N, Ferlin 
WG, Wu X, Soto H, O'Garra A, Howard MC, Coffman RL. "Depletion of eosinophils in 
mice through the use of antibodies specific for C-C chemokine receptor 3 (CCR3)." Journal of 
Leukocyte Biology, Vol 65(6), 846-53, 1999 

6. Oliver AM, Grimaldi JC, Howard MC, Kearney JF. "Independently ligating CD38 and Fc 
gammaRIIB relays a dominant negative signal to B cells." Hybridoma Vol. 18(2), 113-9, 
1999 

7. Cockayne DA, Muchamuel T, Grimaldi JC, Muller-Steffner H, Randall TD, Lund FE, 
Murray R, Schuber F, Howard MC. "Mice deficient for the ecto-nicotinamide adenine 
dinucleotide glycohydrolase CD38 exhibit altered humoral immune responses." Blood Vol 
92(4), 1324-33, 1998 

8. Frances E. Lund, Nanette W. Solvason, Michael P. Cooke, Andrew W. Heath, J. Christopher 
Grimaldi, Troy D. Randall, R. M. E. Parkhouse, Christopher C. Goodnow and Maureen C. 
Howard. "Signaling through murine CD38 is impaired in antigen receptor unresponsive B 
cells." European Journal of Immunology, Vol. 25(5), 1338-1345, 1995 

9. M. J. Guimaraes, J. F. Bazan, A. Zolotnik, M. V. Wiles, J. C. Grimaldi, F. Lee, T. 
McClanahan. "A new approach to the study of haematopoietic development in the yolk sac 
and embryoid body." Development, Vol. 121(10), 3335-3346, 1995 

10. J. Christopher Grimaldi, Sriram Balasubramanian, J. Fernando Bazan, Armen Shanafelt, 
Gerard Zurawski and Maureen Howard. "CD38-mediated protein ribosylation." Journal of 
Immunology, Vol. 155(2), 811-817, 1995 

11. Leopoldo Santos-Argumedo, Frances E. Lund, Andrew W. Heath, Nanette Solvason, Wei 
Wei Wu, J. Christopher Grimaldi, R. M. E. Parkhouse and Maureen Howard. "CD38 
unresponsiveness of xid B cells implicates Bruton's tyrosine kinase (btk) as a regulator of 
CD38 induced signal transduction." International Immunology, Vol 7(2), 163-170, 1995 

12. Frances Lund, Nanette Solvason, J. Christopher Grimaldi, R. M. E. Parkhouse and Maureen 
Howard. "Murine CD38: An immunoregulatory ectoenzyme." Immunology Today Vol 
16(10), 469-473, 1995 

13. Maureen Howard, J. Christopher Grimaldi, J. Fernando Bazan, Frances E. Lund, Leopoldo 
Santos-Argumedo, R. M. E. Parkhouse, Timothy F. Walseth, and Hon Cheung Lee. 
"Formation and Hydrolysis of Cyclic ADP-Ribose Catalyzed by Lymphocyte Antigen 
CD38." Science, Vol. 262, 1056-1059, 1993 

14. Nobuyuki Harada, Leopoldo Santos-Argumedo, Ray Chang, J. Christopher Grimaldi, Frances 
Lund, Camilynn I. Brannan, Neal G. Copeland, Nancy A. Jenkins, Andrew Heath, R. M. E. 
Parkhouse and Maureen Howard. "Expression Cloning of a cDNA Encoding a Novel Murine 
B Cell Activation Marker: Homology to Human CD38." The Journal of Immunology, Vol 
151, 3111-3118, 1993 

15. David J. Rawlings, Douglas C. Saffran, Satoshi Tsukada, David A. Largaespada, J. 
Christopher Grimaldi, Lucie Cohen, Randolph N. Mohr, J. Fernando Bazan, Maureen 
Howard, Neal G. Copeland, Nancy A. Jenkins, Owen Witte. "Mutation of Unique Region of 
Bruton's Tyrosine Kinase in Immunodeficient XID Mice." Science, Vol. 261, 358-360, 1993 

16. J. Christopher Grimaldi, Raul Torres, Christine A. Kozak, Ray Chang, Edward Clark, 
Maureen Howard, and Debra A. Cockayne. "Genomic Structure and Chromosomal Mapping 
of the Murine CD40 Gene." The Journal of Immunology, Vol 149, 3921-3926, 1992 

17. Timothy C. Meeker, Bruce Shiramizu, Lawrence Kaplan, Brian Herndier, Henry Sanchez, J. 
Christopher Grimaldi, James Baumgartner, Jacob Rachlin, Ellen Feigal, Mark Rosenblum and 
Michael S. McGrath. "Evidence for Molecular Subtypes of HIV-Associated Lymphoma: 
Division into Peripheral Monoclonal, Polyclonal and Central Nervous System Lymphoma." 
AIDS, Vol 5, 669-674, 1991 

18. Ann Grimaldi and Chris Grimaldi. "Small-Scale Lambda DNA Prep." Contribution to 
Current Protocols in Molecular Biology, Supplement 5, Winter 1989 

19. J. Christopher Grimaldi, Timothy C. Meeker. "The t(5;14) Chromosomal Translocation in a 
Case of Acute Lymphocytic Leukemia Joins the Interleukin-3 Gene to the Immunoglobulin 
Heavy Chain Gene." Blood, Vol. 73, 2081-2085, 1989 

POLYMORPHIC HUMAN PC-1 SEQUENCES 
ASSOCIATED WITH INSULIN RESISTANCE 

CROSS-REFERENCE 

This application claims priority to provisional patent 
application No. 60/108,853, filed Nov. 18, 1998. 

Insulin resistance occurs in 25% of non-diabetic, non- 
obese, apparently healthy individuals, and predisposes them 
to both diabetes and coronary artery disease. Diabetes mel- 
litus is a major health problem in the United States affecting 
approximately 7% of the population. The most common 
form of diabetes mellitus is non-insulin-dependent diabetes 
mellitus (NIDDM or type II diabetes). Hyperglycemia in 
type II diabetes is the result of both resistance to insulin in 
muscle and other key insulin target tissues, and decreased 
beta cell insulin secretion. Longitudinal studies of individu- 
als with a strong family history of diabetes indicate that the 
insulin resistance precedes the secretory abnormalities. Prior 
to developing diabetes these individuals compensate for 
their insulin resistance by secreting extra insulin. Diabetes 
results when the compensatory hyperinsulinemia fails. The 
secretory deficiency of pancreatic beta cells then plays a 
major role in the severity of the diabetes. 

Reaven (1988) Diabetes 37:1595-607 were the first to 
have investigated insulin resistant, non-diabetic, healthy 
individuals from the general population who are non-obese. 
Strikingly, they observed that 25% of them have insulin 
resistance that is of a similar magnitude to that seen in type 
II diabetes patients. These individuals compensate by having 
insulin levels that are 3-4 times higher than normal. These 
elevated insulin levels are sufficient to maintain normogly- 
cemia. Others have also confirmed that a large proportion of 
the non-diabetic population is insulin resistant. These insulin 
resistant, non-diabetic individuals have a much higher risk 
for developing type II diabetes than insulin sensitive sub- 
jects. 

However, even without developing hyperglycemia and 
diabetes, these insulin resistant individuals pay a significant 
price in terms of general health. Insulin resistance results in 
an increased risk for having elevated plasma triglycerides 
(TG), lower high density lipoproteins (HDL), and high 
blood pressure, a cluster of abnormalities that have been 
termed by different investigators as either Syndrome X, the 
insulin resistance syndrome, or the metabolic syndrome. It is 
believed that either the hyperinsulinemia, insulin resistance, 
or both play a direct role in causing these abnormalities. 
Data from ethnic, family, and longitudinal studies suggest 
that a major component of resistance is inherited. 

The cellular response to insulin is mediated through the 
insulin receptor (IR), which is a tetrameric protein consist- 
ing of two identical extracellular alpha subunits that bind the 
hormone and two identical transmembrane beta subunits that 
have intracellular tyrosine kinase activity. When insulin 
binds to the IR alpha subunit, the beta subunit tyrosine 
kinase domain is activated, and insulin action ensues. When 
insulin activates the receptor, the beta subunit is autophos- 
phorylated at the juxtamembrane domain, the tyrosine 
kinase domain and the C-terminal domain. Subsequently, 
endogenous substrates including IRS-1, IRS-2 and SHC are 
tyrosine phosphorylated. These phosphorylated substrates 
act as docking molecules to activate SH2 domain molecules 
including: GRB-2 which activates the ras pathway; the p85 
subunit of PI-3-kinase; protein tyrosine phosphatase 
PTP2/SYP; PLCγ/NCK; AKT and others. 

PC-1 is a class II transmembrane glycoprotein that is 
located both on plasma membranes and in the endoplasmic 
reticulum. PC-1 is the same protein as liver nucleotide 
pyrophosphatase/alkaline phosphodiesterase I. In addition to 
muscle tissue, PC-1 has been reported to be expressed in 
plasma and intracellular membranes of plasma cells, 
placenta, the distal convoluted tubule of the kidney, ducts of 
the salivary gland, epididymis, proximal part of the vas 
deferens, chondrocytes and dermal fibroblasts. PC-1 exists 
as a disulfide linked homodimer of 230-260 kDa; the 
reduced form of the protein has a molecular size of 115-135 
kDa, depending on the cell type. Human PC-1 is predicted 
to have 873 amino acids. 

PC-1 is inserted into the plasma membrane such that there 
is a small cytoplasmic amino terminus, and a larger extra- 
cellular carboxyl terminus. The extracellular domain of 
PC-1 has a high cysteine region that is involved in dimer 
formation, an ATP binding site and enzymatic activity which 
cleaves sugar-phosphate, phosphosulfate, pyrophosphate, 
and phosphodiesterase linkages. The active enzyme site for 
phosphodiesterase and pyrophosphatase contains a key 
threonine residue; however, a mutation of this residue does 
not impair the ability of PC-1 to inhibit IR function. 

Belli et al. (1993) Eur J Biochem 217(1):421-8 discloses 
the existence of enzymatically active water-soluble forms of 
PC-1. Biosynthetic studies revealed a single, monomeric, 
endoglycosidase-H-sensitive membrane PC-1 precursor, 
which was gradually converted to a disulphide-bonded, 
endoglycosidase-H-resistant form. The soluble form of PC-1 
does not appear to arise by proteolytic cleavage from the cell 
surface, although cleavage inside the cell remains a possi- 
bility. The data suggest that the most likely site of cleavage 
is between Pro 152 and Ala 153. 

PC-1 levels are increased in fibroblasts from most patients 
with typical NIDDM and insulin resistance. In addition, 
overexpression of PC-1 in transfected cultured cells reduces 
insulin-stimulated tyrosine kinase activity (Goldfine et al. 
(1998) Mol Cell Biochem 182:177-184). PC-1 content in 
fibroblasts negatively correlates with both decreased in vivo 
insulin sensitivity and decreased in vitro IR autophosphorylation 
(Frittitta et al. (1998) Diabetes 47:1095-1100). 

In cells from insulin-resistant subjects, insulin stimulation 
of glycogen synthetase was decreased. PC-1 content is also 
elevated in fibroblasts, muscle and fat of non-diabetic insulin 
resistant subjects. The elevation of PC-1 content may be a 
primary factor in the cause of insulin resistance, although the 
mechanism by which PC-1 inhibits insulin receptor activity 
is unknown. 

Many mechanisms may potentially contribute to insulin 
resistance. One major mechanism is the impairment of 
insulin receptor tyrosine kinase (IR-TK) activity, a key step 
in insulin receptor signalling. Several inhibitors of IR-TK 
have been associated with insulin resistance. Among them is 
PC-1, a class II transmembrane glycoprotein that is overexpressed 
in tissues of insulin resistant subjects. The human 
PC-1 gene has been assigned to the same chromosomal 
region (6q22-q23) where both STS D6S290 (which has been 
linked to type 2 diabetes in Mexican-Americans), and the 
gene responsible for transient neonatal diabetes map. The 
identification and characterization of genetic sequences 
involved in insulin resistance is of great medical interest. 

Database References for Genetic Sequences 

The human cDNA and encoded amino acid sequence for 
PC-1 may be accessed in Genbank, M57736 J05654. As a 
reference, the "K" allele is provided herein as SEQ ID NO:1, 
and the encoded polypeptide as SEQ ID NO:2. The "Q" 
allele is provided as SEQ ID NO:3, and the encoded 
polypeptide as SEQ ID NO:4. 

SUMMARY OF THE INVENTION 

Human PC-1 nucleic acids and polypeptides are provided, 
including promoter and intron-exon boundaries. Polymor- 
phic sequences are provided that encode a form of the 
protein associated with increased insulin resistance, where a 
naturally occurring polymorphism of interest comprises a 
lys→gln substitution at position 121 of the protein, in the 
high cysteine region. Also provided are polymorphisms in 
the 3' untranslated region of PC-1. The subject nucleic acids 
and fragments thereof, encoded polypeptides, and antibodies 
specific for the polymorphic amino acid sequence are useful 
in determining a genetic predisposition to insulin resistance. 
The encoded protein is useful in drug screening for compo- 
sitions that affect the activity of PC-1 and insulin receptor 
activity or expression. Screening methods that analyze 
plasma levels of soluble PC-1 are also provided, where 
convenient quantitation of PC-1 content is used in diagnosis 
of insulin resistance. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGS. 1A and 1B. Sequence analysis of PC-1 exon 4. 
Arrows point to nucleotide N. The Q allele sequence is 
depicted in the upper panel, the K allele in the lower. The 
AvaII restriction enzyme recognition site is underlined in the 
Q allele sequence. AvaII digestion of PC-1 exon 4 amplimers 
from 7 different genomic DNAs. The 238 bp amplimer is 
completely digested in the QQ sample, resulting in two smaller 
148 and 90 bp fragments. While KK samples remain 
undigested, KQ samples reveal a partial (50%) digestion. 

FIGS. 2(A) and 2(B). Plasma glucose (A) and insulin (B) 
levels during an OGTT (75 g) in Q allele carriers (n=33, 
white circles) and KK subjects (n=68, black circles). 
§=p<0.05 and *=p<0.01 vs. KK subjects. 

FIG. 3. Insulin receptor autophosphorylation in fibro- 
blasts from Q allele carriers (n=5, white circles) and KK 
subjects (n=5, black circles). This function was determined 
by exposing cells for 10 minutes to increasing insulin 
concentrations (0-100 nM). Cells were then solubilized and 
the insulin receptor immunocaptured on plastic wells pre- 
coated with a monoclonal antibody specific to the insulin 
receptor. After washing, a biotinylated antiphosphotyrosine 
antibody was added followed by peroxidase-conjugated 
streptavidin detection assay. Data are expressed as arbitrary 
units, normalized for protein content. **=p<0.01 vs. KK 
subjects. 

DESCRIPTION OF THE SPECIFIC 
EMBODIMENTS 

Methods and compositions are provided for diagnosing a 
predisposition to human insulin resistance. The methods 
comprise an analysis of germline DNA for a predisposing 
polymorphism in the gene encoding PC-1, where presence 
of the altered gene confers an increased susceptibility to 
insulin resistance. Human PC-1 gene and gene product 
compositions are provided that encode specific polymorphic 
forms of PC-1. Polymorphisms of interest include a coding 
change at position 121 of the protein, and polymorphisms of 
the 3' UTR. 

In another embodiment of the invention, the concentration 
of soluble PC-1 protein in patient plasma is used as a 
diagnostic. PC-1 circulates in human plasma and low plasma 
PC-1 level is independently associated with several features 
of the insulin resistant "metabolic syndrome" including 
abdominal fat distribution, high blood pressure and, with 
respect to lipid metabolism, insulin resistance. 

PC-1 is a class II membrane glycoprotein that inhibits 
activation of insulin receptor tyrosine kinase, and is associ- 
ated with insulin resistance. A novel polymorphism in exon 
4 of the PC-1 gene is significantly correlated with insulin 
resistance. The subject genes and fragments thereof, 
encoded protein, and antibodies specific for the insulin 
resistance associated forms of PC-1 are useful in character- 
izing patients for susceptibility to insulin resistance. Such 
screening methods may be used in conjunction with coun- 
seling and preventive measures. 

Nucleic Acid Compositions 

As used herein, the term PC-1 genes and encoded 
polypeptides shall be used to generally designate any of the 
mammalian PC-1 genes and gene products, and unless 
otherwise stated will be the human homolog. The sequences 
of the invention comprise a sequence polymorphism, gen- 
erally resulting in a change in coding sequence, that confers 
a susceptibility to insulin resistance, and may lead to hyper- 
glycemia and NIDDM. Such polymorphisms may be generi- 
cally referred to herein as a resistance associated PC-1 
sequence, or PC-1^. Counseling and preventive measures 
are particularly important for such patients, and early diag- 
nosis provides information concerning such a predisposi- 
tion. 

The effect of a candidate sequence polymorphism on 
PC-1 expression or function may be determined by kindred 
analysis for segregation of the sequence variation with the 
disease phenotype, e.g. insulin resistance, hyperglycemia, 
etc. A predisposing mutation will segregate with incidence 
of the disease. The subject mutations generally have a 
dominant phenotype, where a single altered allele will 
confer disease susceptibility. The penetrance will vary with 
the specific mutation. 

As an alternative to kindred studies, biochemical studies 
are performed to determine whether a candidate sequence 
variation in the PC-1 coding region or control regions affects 
the quantity or function of the protein. The effect of a 
sequence variation on the interaction between PC-1 and 
insulin receptor or other tyrosine kinases is determined by 
binding studies or kinase assays, where a decreased level of 
inhibition or binding is indicative of a predisposing muta- 
tion. Normal PC-1 will inhibit the tyrosine kinase activity of 
the insulin receptor, but not other tyrosine kinases. 

In one embodiment of the invention, polymorphisms of 
interest provide for amino acid substitutions in the extracel- 
lular domain of PC-1, particularly the cysteine-rich domain, 
which may substitute a charged amino acid with a neutral 
amino acid. In one embodiment of the invention the amino 
acid substitution is at a lysine residue in this region, e.g. 
K121 or K119. Polymorphisms at these residues, where the 
lysine is substituted with any of the other 19 naturally 
occurring amino acids, may be referred to generically as 
[*121] PC-1 or [*119] PC-1 polymorphisms. Specific poly- 
morphisms of interest substitute a neutral amino acid in 
place of the lysine, particularly glutamine or arginine. A 
naturally occurring polymorphism associated with insulin 
resistance comprises a lys→gln substitution at position 121 
of the protein, herein referred to as "[K121Q] PC-1", or 
merely "[Q] PC-1". 

The human [Q] PC-1 amino acid sequence is provided as 
SEQ ID NO:4, and the encoding gene as SEQ ID NO:3. In 
order to identify the subject PC-1 polymorphisms, exonic 
primers from the published sequence data were used to 
isolate genomic clones. Sequence data from the genomic 
clones was used to generate specific primers. These primers 
were used to amplify genomic DNA. The PCR products 
were screened for mutations using single strand conforma- 
tion polymorphism (SSCP) analysis. The specific polymor- 
phism found in SEQ ID NO:3 was identified in a number of 
patients. 

DNA encoding a protein may be cDNA or genomic 
DNA or a fragment thereof that encompasses the altered 
residue, e.g. [*121] PC-1. As known in the art, cDNA 
sequences have the arrangement of exons found in processed 
mRNA, forming a continuous open reading frame, while 
genomic sequences may have introns interrupting the open 
reading frame. The term "[*121] PC-1 gene" shall be 
intended to mean the open reading frame encoding such 
specific PC-1 polypeptides, as well as adjacent 5' and 3' 
non-coding nucleotide sequences involved in the regulation 
of expression, up to about 1 kb beyond the coding region, in 
either direction. The intron-exon boundaries of the PC-1 
gene are provided in the examples. 

Genomic sequences of interest comprise the nucleic acids 
present between the initiation codon and the stop codon, 
including all of the introns that are normally present in a 
native chromosome. It may include the 3' and 5' untranslated 
regions found in the mature mRNA. It may further include 
specific transcriptional and translational regulatory 
sequences, such as promoters, enhancers, etc., including 
about 1 kb of flanking genomic DNA at either the 5' or 3' end 
of the coding region. The genomic DNA may be isolated as 
a fragment of 50 kbp or smaller; and substantially free of 
flanking chromosomal sequence. 

The genomic PC-1 5' and 3' sequence, including specific 
transcriptional and translational regulatory sequences, such 
as promoters, enhancers, etc., including about 1 kb, but 
possibly more, of flanking genomic DNA at the 5' end of the 
transcribed region, is of particular interest. The promoter 
region is useful for determining the pattern of PC-1 
expression, e.g. induction and inhibition of expression in 
various tissues, and for providing promoters that mimic 
these native patterns of expression. A polymorphic PC-1 
regulatory sequence, i.e. including one or more of the 
provided 3' UTR polymorphisms, is useful for expression 
studies to determine the effect of sequence variations on 
mRNA expression and stability. The polymorphisms are also 
used as single nucleotide polymorphisms to detect genetic 
linkage to phenotypic variation in activity and expression of 
PC-1. The polymorphic 3' UTR sequences are provided as 
SEQ ID NO:6 ("A" allele); SEQ ID NO:7 ("P" allele); and 
SEQ ID NO:8 ("N" allele). The polymorphisms are as 
follows: 



nucleotide position    127    136    178 
SEQ ID NO: 6           G      G      C 
SEQ ID NO: 7           A      C      T 
SEQ ID NO: 8           A      G      T 
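
The table of 3' UTR polymorphisms can be read as a lookup from the three variable bases to the allele name. The following is a minimal sketch only: the function names, the synthetic demo sequence, and the assumption that positions 127, 136 and 178 are 1-based offsets into the listed 3' UTR sequences are all illustrative and are not taken from the specification.

```python
# Illustrative lookup of the 3' UTR allele from the bases at the three
# polymorphic positions listed in the table. Names are hypothetical.
POSITIONS = (127, 136, 178)  # assumed to be 1-based positions

UTR_ALLELES = {
    ("G", "G", "C"): "A",  # SEQ ID NO:6
    ("A", "C", "T"): "P",  # SEQ ID NO:7
    ("A", "G", "T"): "N",  # SEQ ID NO:8
}


def call_utr_allele(utr_sequence):
    """Return 'A', 'P' or 'N', or None if the base combination is not tabulated."""
    bases = tuple(utr_sequence[pos - 1].upper() for pos in POSITIONS)
    return UTR_ALLELES.get(bases)


def make_demo_utr(bases, length=200):
    """Build a synthetic placeholder sequence carrying the given three bases."""
    seq = ["N"] * length
    for pos, base in zip(POSITIONS, bases):
        seq[pos - 1] = base
    return "".join(seq)


if __name__ == "__main__":
    for bases in ("GGC", "ACT", "AGT", "GCT"):
        print(bases, call_utr_allele(make_demo_utr(bases)))
```

A combination that does not appear in the table returns None rather than being forced into one of the three named alleles.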



The promoter region of PC-1 is provided as SEQ ID 
NO:5. The promoter region is useful for determining natural 
patterns of expression, particularly those that may be asso- 
ciated with disease. Alternatively, mutations may be intro- 
duced into the promoter region to determine the effect of 
altering expression in experimentally defined systems. The 
promoter also finds use in the construction of animal models 
where it is desirable to mimic the native patterns of PC-1 
expression. Methods for the identification of specific DNA 
motifs involved in the binding of transcriptional factors are 
known in the art, e.g. sequence similarity to known binding 
motifs, gel retardation studies, etc. For examples, see Black- 
well et al. (1995) Mol Med 1: 194-205; Mortlock et al. 
(1996) Genome Res. 6: 327-33; and Joulin and Richard-Foy 
(1995) Eur J Biochem 232: 620-626. Specific regulatory 
motifs are found in the provided promoter sequence at 
positions: SEQ ID NO:5, 192-205; and SEQ ID NO:5, 
453-458. 

The nucleic acid compositions of the subject invention 
encode all or a part of the subject polypeptides. Fragments 
may be obtained of the DNA sequence by chemically 
synthesizing oligonucleotides in accordance with conven- 
tional methods, by restriction enzyme digestion, by PCR 
amplification, etc. For the most part, DNA fragments will be 
at least about 25 nt in length, usually at least about 30 nt, 
more usually at least about 50 nt. For use in amplification 
reactions, such as PCR, a pair of primers will be used. The 
exact composition of the primer sequences is not critical to 
the invention, but for most applications the primers will 
hybridize to the subject sequence under stringent conditions, 
as known in the art. It is preferable to choose a pair of primers 
that will generate an amplification product of at least about 
50 nt, preferably at least about 100 nt. Algorithms for the 
selection of primer sequences are generally known, and are 
available in commercial software packages. Amplification 
primers hybridize to complementary strands of DNA, and 
will prime towards each other. Amplification primers of 
interest include the intron sequences flanking each exon, as 
shown in the examples, which may lie immediately outside 
of the coding sequence, or may span the actual junction. Use 
of such primers allows specific amplification of the exon 
sequence from genomic DNA. 

The subject PC-1^ genes and associated sequences are 
isolated and obtained in substantial purity, generally as other 
than an intact mammalian chromosome. Usually, the DNA 
will be obtained substantially free of other nucleic acid 
sequences that do not include a PC-1 sequence or fragment 
thereof, generally being at least about 50%, usually at least 
about 90% pure and are typically "recombinant", i.e. flanked 
by one or more nucleotides with which it is not normally 
associated on a naturally occurring chromosome. 

PC-1 Polypeptides 

The subject nucleic acid compositions may be employed 
for producing PC-1^ protein, or fragments thereof that 
encompass a polymorphism of interest, e.g. [Q121] PC-1. 
For expression, an expression cassette may be employed, 
providing for a transcriptional and translational initiation 
region, which may be inducible or constitutive, the coding 
region under the transcriptional control of the transcriptional 
initiation region, and a transcriptional and translational 
termination region. Various transcriptional initiation regions 
may be employed which are functional in the expression 
host. 

The peptide may be expressed in prokaryotes or eukary- 
otes in accordance with conventional ways, depending upon 
the purpose for expression. For large scale production of the 
protein, a unicellular organism or cells of a higher organism, 
e.g. eukaryotes such as vertebrates, particularly mammals, 
may be used as the expression host, such as E. coli, B. 
subtilis, S. cerevisiae, and the like. In many situations, it may 
be desirable to express the subject PC-1 gene in a mamma- 
lian host, whereby the PC-1 gene product will be 
glycosylated, and secreted. 

US 6,465,185 B1 

With the availability of the protein in large amounts by 
employing an expression host, the protein may be isolated 
and purified in accordance with conventional ways. A lysate 
may be prepared of the expression host and the lysate 
purified using HPLC, exclusion chromatography, gel 
electrophoresis, affinity chromatography, or other purifica- 
tion technique. The purified protein will generally be at least 
about 80% pure, preferably at least about 90% pure, and may 
be up to and including 100% pure. By pure is intended free 
of other proteins, as well as of cellular debris. 

The polypeptide is used for the production of antibodies, 
where short fragments provide for antibodies specific for the 
particular polypeptide, and larger fragments allow for the 
production of antibodies over the surface of the protein. 
Antibodies may be raised to the normal or insulin resistant 
forms of PC-1. Of particular interest are antibodies that 
specifically recognize the insulin resistant forms of the 
protein, i.e. the antibodies do not bind to the normal form. 
Also of interest are antibodies that recognize the soluble 
forms of the protein. Antibodies may be raised to isolated 
peptides corresponding to these mutations, or to the native 
protein, e.g. by immunization with cells expressing PC-1, 
immunization with liposomes containing PC-1, etc. Such 
antibodies are useful in therapy and diagnosis. 

Antibodies are prepared in accordance with conventional 
ways, where the expressed polypeptide or protein is used as 
an immunogen, by itself or conjugated to known immuno- 
genic carriers, e.g. KLH, pre-S HBsAg, other viral or 
eukaryotic proteins, or the like. Various adjuvants may be 
employed, with a series of injections, as appropriate. For 
monoclonal antibodies, after one or more booster injections, 
the spleen is isolated, the splenocytes immortalized, and 
then screened for high affinity antibody binding. The immor- 
talized cells, e.g. hybridomas, producing the desired anti- 
bodies may then be expanded. For further description, see 
Monoclonal Antibodies: A Laboratory Manual, Harlow and 
Lane eds., Cold Spring Harbor Laboratories, Cold Spring 
Harbor, N.Y., 1988. If desired, the mRNA encoding the 
heavy and light chains may be isolated and mutagenized by 
cloning in E. coli, and the heavy and light chains mixed to 
further enhance the affinity of the antibody. Alternatives to 
in vivo immunization as a method of raising antibodies 
include binding to phage display libraries, usually in con- 
junction with in vitro affinity maturation. 

Phenotypic Indications 

Insulin resistance is an essential feature of a great variety 
of clinical disorders in addition to diabetes, including coro- 
nary artery disease, hyperlipidemia, obesity and hyperten- 
sion. Individuals with non-insulin dependent diabetes have 
insulin resistance in peripheral tissues. They have a subnor- 
mal glucose utilization in skeletal muscle, where glucose 
transport across the cell membrane of skeletal muscle is the 
rate limiting step in glucose metabolism. In adipose and 
muscle cells, insulin stimulates a rapid and dramatic 
increase in glucose uptake, primarily by promoting the 
redistribution of the GLUT4 glucose transporter from its 
intracellular storage site to the plasma membrane. Impaired 
glucose tolerance (IGT) is associated with a normal fasting 
blood glucose but an elevated postprandial blood sugar 
between 7.8 and 11 mmol/L (140 and 199 mg/dL). Some 
patients with IGT are hyperinsulinemic, and progress to 
NIDDM. 

The response to insulin has been measured by a number 
of different methods, and insulin resistance has been quan- 
tified by a number of different indices. A variety of proce- 
dures have been developed to detect the presence of insulin 
resistance. Using any of these techniques, there is a wide 
range of insulin sensitivity in normal individuals, some of 
whose values overlap with similar values in people with 
diabetes. Therefore, one cannot distinguish between nondia- 
betic and diabetic individuals on the basis of measures of 
insulin resistance. 

The most widely accepted research method or 'gold 
standard' is the euglycemic insulin clamp technique. With 
this procedure, exogenous insulin is infused, so as to main- 
tain a constant plasma insulin level above fasting, while 
glucose is fixed at a basal level by infusing glucose at 
varying rates. This glucose infusion is delivered via an 
indwelling catheter at a rate based on plasma glucose 
[illegible] glucose infusion rate is increased to 
return plasma glucose to basal levels and vice versa. The 
amount of glucose infused over time (M value) is an index 
of [illegible] metabolism. The more glucose 
[illegible] 
the patient is to insulin. Conversely, the insulin-resistant 
patient requires much less glucose to maintain basal plasma 
glucose levels. [illegible] metabolism can 
be assessed in the absence of the confounding effects of 
hypoglycemic counterregulation, endogenous insulin 
secretion, or variable levels of hyperglycemia, and multiple 
[illegible] be assessed by using isotopes, including 
regulation of glucose uptake and production, inhibition of 
lipolysis, and changes in protein metabolism. 

An alternative is the minimal model. With this procedure, 
glucose and insulin are sampled frequently from an indwell- 
ing catheter during an intravenous glucose tolerance test; the 
results are entered into a computer model, which generates 
an index of insulin sensitivity (called Si). The 
acute insulin release (AIR) in response to glucose is also 
determined by the test. This measure of insulin resistance 
correlates reasonably well with the euglycemic insulin 
clamp in nondiabetic subjects. Its accuracy deteriorates in 
diabetes because the immediate plasma insulin response to 
the glucose challenge is diminished. Therefore, additional 
maneuvers are needed to raise plasma insulin levels, such as 
giving tolbutamide or exogenous insulin in the course of the 
test. 

A practical way of assessing insulin resistance is 
the homeostasis model assessment (HOMA-IR), involving 
fasting insulin and glucose levels. This value is calculated as 
fasting plasma insulin (μU/ml) × fasting plasma glucose 
(mmol/L)/22.5 (Matthews et al. (1985) Diabetologia 
28:412-9). The steady-state basal plasma glucose and insu- 
lin concentrations are determined by their interaction in a 
feedback loop. A computer-solved model has been used to 
predict the homeostatic concentrations which arise from 
varying degrees of beta-cell deficiency and insulin resistance. 
Comparison of a patient's fasting values with the model's 
predictions allows a quantitative assessment of the contri- 
butions of insulin resistance and deficient beta-cell function 
to the fasting hyperglycaemia. The estimate of insulin resis- 
tance obtained by homeostasis model assessment correlates 
with estimates obtained by use of the euglycaemic clamp, 
the fasting insulin concentration, and the hyperglycaemic 
clamp. The lower limit of the top quintile of HOMA-IR 
distribution (i.e. 2.77) in nonobese subjects with no meta- 
bolic disorders has been chosen as the threshold for insulin 
resistance in some studies (Bonora et al. (1998) Diabetes 
47:1643-9). The results of this study documented that 1) in 
hypertriglyceridemia and a low HDL cholesterol state, insu- 
lin resistance is as common as in NIDDM, whereas it is less 
frequent in hypercholesterolemia, hyperuricemia, and 
hypertension; 2) the vast majority of subjects with multiple 
metabolic disorders are insulin resistant; 3) in isolated 
hypercholesterolemia, hyperuricemia, or hypertension, insu- 
lin resistance is not more frequent than can be expected by 
chance alone; and 4) in the general population, insulin 
resistance can be found even in the absence of any major 
metabolic disorders. 
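
The homeostasis model assessment described above reduces to a single arithmetic expression, which can be sketched as follows. The function names are illustrative, and treating a value at or above 2.77 as insulin resistant is an assumption about how the quoted threshold (the lower limit of the top quintile) would be applied; the specification does not state whether the comparison is inclusive.

```python
# Sketch of the HOMA-IR calculation quoted in the text:
#   fasting insulin (uU/ml) x fasting glucose (mmol/L) / 22.5
# Function names and the inclusive threshold comparison are assumptions.
HOMA_IR_THRESHOLD = 2.77  # lower limit of the top quintile cited in the text


def homa_ir(fasting_insulin_uU_per_ml, fasting_glucose_mmol_per_l):
    """Return the HOMA-IR index from fasting insulin and fasting glucose."""
    if fasting_insulin_uU_per_ml < 0 or fasting_glucose_mmol_per_l < 0:
        raise ValueError("concentrations must be non-negative")
    return fasting_insulin_uU_per_ml * fasting_glucose_mmol_per_l / 22.5


def exceeds_threshold(fasting_insulin_uU_per_ml, fasting_glucose_mmol_per_l,
                      threshold=HOMA_IR_THRESHOLD):
    """True if the HOMA-IR index is at or above the chosen threshold."""
    return homa_ir(fasting_insulin_uU_per_ml, fasting_glucose_mmol_per_l) >= threshold


if __name__ == "__main__":
    # Illustrative values only, not patient data.
    print(homa_ir(10.0, 4.5))            # 10 x 4.5 / 22.5
    print(exceeds_threshold(15.0, 5.0))  # 15 x 5.0 / 22.5 compared with 2.77
```

Glucose reported in mg/dL would first need conversion to mmol/L before using this expression.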

The measurement of insulin concentration can be done in 
the overnight fasted condition, since in the postprandial 
state, glucose levels are changing rapidly and the variable 
levels of glucose confound the simultaneous measure of 
insulin levels as an index of insulin action. There is a 
significant correlation between fasting insulin levels and 
insulin action as measured by the clamp technique. Very 
high plasma insulin values in the setting of normal glucose 
levels are very likely to reflect insulin resistance. As indi- 
viduals develop diabetes, plasma glucose increases and 
plasma insulin decreases and so the plasma insulin level no 
longer reflects only insulin resistance because it becomes 
influenced by the appearance of a β-cell defect and hyper- 
glycemia. 

Detection of PC-1 Associated Insulin Resistance 

DNA from a patient having insulin resistance, as 
described above, suspected of association with aberrant 
PC-1 is analyzed for the presence of an IR polymorphism. 
Genetic characterization analyzes DNA or RNA, from any 
source, e.g. skin, cheek scrapings, blood samples, etc. The 
nucleic acids are screened for the presence of an insulin 
resistant polymorphism, e.g. SEQ ID NO:3, as compared to 
a normal sequence (SEQ ID NO:1, SEQ ID NO:2). 

A number of methods are available for analyzing nucleic 
acids for the presence or absence of a specific sequence. 
Where large amounts of DNA are available, genomic DNA 
is used directly. Analysis of genomic DNA may use whole 
chromosomes or fractionated DNA, e.g. Southern blots, etc. 
Comparative Genomic Hybridization (CGH), as described 
in U.S. Pat. No. 5,665,549, provides methods for determin- 
ing the relative number of copies of a genomic sequence. 
The intensity of the signals from each labeled subject 
nucleic acid and/or the differences in the ratios between 
different signals from the labeled subject nucleic acid 
sequences are compared to determine the relative copy 
numbers of the nucleic acid sequences as a function of 
position along the reference chromosome spread. Other 
methods for fluorescence in situ hybridization are known in 
the art; for a review, see Fox et al. (1995) Clin Chem 
41(11): 1554-1559. 

Alternatively, the region of interest is cloned into a 
suitable vector and grown in sufficient quantity for analysis. 
Cells that express PC-1 may be used as a source of mRNA, 
which may be assayed directly or reverse transcribed into 
cDNA for analysis. The nucleic acid may be amplified by 
conventional techniques, such as the polymerase chain reac- 
tion (PCR), to provide sufficient amounts for analysis. The 
use of the polymerase chain reaction is described in Saiki, et 
al. (1988) Science 239:487, and a review of techniques may 
be found in Sambrook, et al. Molecular Cloning: A Labo- 
ratory Manual, CSH Press 1989, pp. 14.2-14.33. 
Alternatively, various methods are known in the art that 
utilize oligonucleotide ligation as a means of detecting 
polymorphisms, for examples see Riley et al. (1990) N.A.R. 
18:2887-2890; and Delahunty et al. (1996) Am. J. Hum. 
Genet. 58:1239-1246. 

A detectable label may be included in an amplification 
reaction. Suitable labels include fluorochromes, e.g. fluo- 
rescein isothiocyanate (FITC), rhodamine, Texas Red, 
phycoerythrin, allophycocyanin, 6-carboxyfluorescein 
(6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6- 
carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX), 
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 
5-carboxyfluorescein (5-FAM) or N,N,N',N'-tetramethyl-6- 
carboxyrhodamine (TAMRA), radioactive labels, e.g. 32P, 
35S, 3H; etc. The label may be a two stage system, where the 
amplified DNA is conjugated to biotin, haptens, etc. having 
a high affinity binding partner, e.g. avidin, specific 
antibodies, etc., where the binding partner is conjugated to 
a detectable label. The label may be conjugated to one or 
both of the primers. Alternatively, the pool of nucleotides 
used in the amplification is labeled, so as to incorporate the 
label into the amplification product. 

The sample nucleic acid, e.g. genomic DNA, amplifica- 
tion product or cloned fragment, is analyzed by one of a 
number of methods known in the art. The nucleic acid may 
be sequenced by dideoxy or other methods, and the sequence 
of bases compared to a wild-type PC-1 sequence. Hybrid- 
ization with the variant sequence may also be used to 
determine its presence, by Southern blots, dot blots, etc. The 
hybridization pattern of a control and variant sequence to an 
array of oligonucleotide probes immobilised on a solid 
support, as described in U.S. Pat. No. 5,445,934, or in 
WO95/35505, may also be used as a means of detecting the 
presence of variant sequences. Single strand conformational 
polymorphism (SSCP) analysis, denaturing gradient gel 
electrophoresis (DGGE), and heteroduplex analysis in gel 
matrices are used to detect conformational changes created 
by DNA sequence variation as alterations in electrophoretic 
mobility. 

Alternatively, where a polymorphism creates or destroys 
a recognition site for a restriction endonuclease, the sample 
is digested with that endonuclease, and the products size 
fractionated to determine whether the fragment was 
digested. Fractionation is performed by gel or capillary 
electrophoresis, particularly acrylamide or agarose gels. The 
[Q] PC-1 allele has an AvaII site that is not present in the [K] 
PC-1 allele, and this difference may be exploited for genetic 
screening. 
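
As an illustration of how the AvaII difference could be scored, the sketch below calls a genotype from the fragment sizes observed after digesting the exon 4 amplimer. The sizes (238 bp uncut, 148 bp and 90 bp cut) are those given in the description of FIG. 1; the function name, the sizing tolerance, and the decision to return None for unexpected band patterns are assumptions made for this example.

```python
# Sketch: genotype call from AvaII digestion of the 238 bp exon 4 amplimer.
# The K allele is not cut (238 bp); the Q allele is cut into 148 + 90 bp.
# Function name and the +/- 5 bp sizing tolerance are illustrative assumptions.
UNCUT_BP = 238
CUT_BP = (148, 90)


def call_avaii_genotype(fragment_sizes, tolerance=5):
    """Return 'KK', 'KQ' or 'QQ' from observed fragment sizes, else None."""
    def seen(expected):
        return any(abs(size - expected) <= tolerance for size in fragment_sizes)

    has_uncut = seen(UNCUT_BP)
    has_cut = all(seen(size) for size in CUT_BP)

    if has_uncut and has_cut:
        return "KQ"  # partial digestion: one allele of each type
    if has_cut:
        return "QQ"  # complete digestion
    if has_uncut:
        return "KK"  # no digestion
    return None      # band pattern not interpretable


if __name__ == "__main__":
    for bands in ([238], [148, 90], [238, 148, 90], [500]):
        print(bands, call_avaii_genotype(bands))
```

An undigested result is only meaningful alongside a control showing that the enzyme was active, which this sketch does not model.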

Changes in the promoter or enhancer sequence that may 
affect expression levels of PC-1 can be compared to expres- 
sion levels of the normal allele by various methods known 
in the art. Methods for determining promoter or enhancer 
strength include quantitation of the expressed natural pro- 
tein; insertion of the variant control element into a vector 
with a reporter gene such as β-galactosidase, luciferase, 
chloramphenicol acetyltransferase, etc. that provides for 
convenient quantitation; and the like. 

Diagnostic screening may also be performed for poly- 
morphisms that are genetically linked to a predisposition for 
PC-1 associated insulin resistance, particularly through the 
use of microsatellite markers, e.g. the variable repeat in 
intron 3, or single nucleotide polymorphisms, e.g. the 3' 
UTR polymorphisms. Frequently the microsatellite poly- 
morphism itself is not phenotypically expressed, but is 
linked to sequences that result in a disease predisposition. 
However, in some cases the microsatellite sequence itself 
may affect gene expression. Microsatellite linkage analysis 
may be performed alone, or in combination with direct 
detection of polymorphisms, as described above. The use of 
microsatellite markers for genotyping is well documented. 
For examples, see Mansfield et al. (1994) Genomics 
24:225-233; Ziegle et al. (1992) Genomics 14:1026-1031; 
Dib et al., supra. 

Microsatellite loci that are useful in the subject methods 
have the general formula: 

U (R)n U' 

where U and U' are non-repetitive flanking sequences that 
uniquely identify the particular locus, R is a repeat motif, 
and n is the number of repeats. The repeat motif is at least 
2 nucleotides in length, up to 7, usually 2-4 nucleotides in 
length. Repeats can be simple or complex. The flanking 
sequences U and U' uniquely identify the microsatellite 
locus within the human genome. U and U' are at least about 
18 nucleotides in length, and may extend several hundred 
bases up to about 1 kb on either side of the repeat. Within 
U and U', sequences are selected for amplification primers. 
The exact composition of the primer sequences is not 
critical to the invention, but they must hybridize to the 
flanking sequences U and U', respectively, under stringent 
conditions. Criteria for selection of amplification primers are 
as previously discussed. To maximize the resolution of size 
differences at the locus, it is preferable to choose a primer 
sequence that is close to the repeat sequence, such that the 
total amplification product is between 100 and 500 nucleotides 
in length. 

The number of repeats at a specific locus, n, is polymor- 
phic in a population, thereby generating individual differ- 
ences in the length of DNA that lies between the amplifi- 
cation primers. The number will vary from at least 1 repeat 
to as many as about 100 repeats or more. 

The primers are used to amplify the region of genomic 
DNA that contains the repeats. Conveniently, a detectable 
label will be included in the amplification reaction, as 
previously described. Multiplex amplification may be per- 
formed in which several sets of primers are combined in the 
same reaction tube. This is particularly advantageous when 
limited amounts of sample DNA are available for analysis. 
Conveniently, each of the sets of primers is labeled with a 
different fluorochrome. 

After amplification, the products are size fractionated. 
Fractionation may be performed by gel electrophoresis, 
particularly denaturing acrylamide or agarose gels. A
convenient system uses denaturing polyacrylamide gels in
combination with an automated DNA sequencer, see Hunkapiller
et al. (1991) Science 254:59-74. The automated sequencer is
particularly useful with multiplex amplification or pooled
products of separate PCR reactions. Capillary electrophoresis
may also be used for fractionation. A review of capillary
electrophoresis may be found in Landers, et al. (1993)
BioTechniques 14:98-111. The size of the amplification
product is proportional to the number of repeats (n) that are 
present at the locus specified by the primers. The size will be 
polymorphic in the population, and is therefore an allelic 
marker for that locus. 
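
The relationship between product size and repeat number can be sketched as follows. This is only an illustration: it assumes the product length equals the combined length of the amplified flanking sequence plus n copies of the repeat motif, and the function names and example sizes are hypothetical rather than taken from the specification.

```python
def amplicon_length(n_repeats, flank_bp, motif_bp):
    """Length of the amplification product for n repeats.

    flank_bp is the total amplified flanking sequence (both sides,
    primers included); motif_bp is the length of the repeat motif R.
    """
    return flank_bp + n_repeats * motif_bp


def repeat_count(amplicon_bp, flank_bp, motif_bp):
    """Infer the number of repeats n from a measured product length."""
    repeat_bp = amplicon_bp - flank_bp
    if repeat_bp < 0 or repeat_bp % motif_bp != 0:
        raise ValueError("product length is inconsistent with the locus model")
    return repeat_bp // motif_bp


# Hypothetical dinucleotide locus with 120 bp of amplified flanking sequence.
alleles = [repeat_count(size, flank_bp=120, motif_bp=2) for size in (156, 160)]
```

Two alleles differing by two dinucleotide repeats therefore differ by 4 bp, which is why keeping the total product short helps resolution.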

Screening for polymorphisms in PC-1 may be based on 
the functional or antigenic characteristics of the protein. 
Protein truncation assays are useful in detecting deletions 
that may affect the biological activity of the protein. Various 
immunoassays designed to detect polymorphisms in PC-1 
proteins may be used in screening. Where many diverse 
genetic mutations lead to a particular disease phenotype, 
functional protein assays have proven to be effective screening
tools, for example by detecting the specific phosphatase
activity on a PC-1 substrate. Alternatively, changes in
electrophoretic mobility may be used.

Antibodies specific for a PC-1^ polymorphism may be
used in staining or in immunoassays. Samples, as used
herein, include cells, e.g. biopsy samples, biological fluids
such as semen, blood, cerebrospinal fluid, tears, saliva,
lymph, dialysis fluid and the like; organ or tissue culture
derived fluids; and fluids extracted from physiological
tissues. Also included in the term are derivatives and fractions
of such fluids. The cells may be dissociated, in the case of
solid tissues, or tissue sections may be analyzed. Alternatively
a lysate of the cells may be prepared.

Diagnosis may be performed by a number of methods to
determine the absence or presence or altered amounts of
normal or PC-1^ in patient cells. For example, detection may
utilize staining of cells or histological sections, performed in
accordance with conventional methods. Cells are permeabilized
to stain cytoplasmic molecules. The antibodies of
interest are added to the cell sample, and incubated for a
period of time sufficient to allow binding to the epitope,
usually at least about 10 minutes. The antibody may be
labeled with radioisotopes, enzymes, fluorescers,
chemiluminescers, or other labels for direct detection.
Alternatively, a second stage antibody or reagent is used to
amplify the signal. Such reagents are well known in the art.
For example, the primary antibody may be conjugated to
biotin, with horseradish peroxidase-conjugated avidin added
as a second stage reagent. Alternatively, the secondary
antibody may be conjugated to a fluorescent compound, e.g.
fluorescein, rhodamine, Texas red, etc. Final detection uses
a substrate that undergoes a color change in the presence of
the peroxidase. The absence or presence of antibody binding
may be determined by various methods, including flow
cytometry of dissociated cells, microscopy, radiography,
scintillation counting, etc.

An alternative method for diagnosis depends on the in
vitro detection of binding between antibodies and polymorphic
PC-1^ in a lysate. Measuring the concentration of
PC-1^ binding in a sample or fraction thereof may be
accomplished by a variety of specific assays. A conventional
sandwich type assay may be used. For example, a sandwich
assay may first attach PC-1^ specific antibodies to an
insoluble surface or support. Patient sample lysates are then
added to the supports (for example, separate wells of a
microtiter plate) containing antibodies. Preferably, a series
of standards, containing known concentrations of normal
and/or PC-1^ is assayed in parallel with the samples or
aliquots thereof to serve as controls. The quantitation may
then be performed by adding a labeled antibody specific for
PC-1^. Other immunoassays are known in the art and may
find use as diagnostics. Ouchterlony plates provide a simple
determination of antibody binding. Western blots may be
performed on protein gels or protein spots on filters, using
a detection system specific for PC-1 as desired, conveniently
using a labeling method as described for the sandwich assay.

Immunoassays may also be used in the detection of 
soluble PC-1 in the plasma of a patient, where quantitative 
and qualitative analysis may be performed. It is found that 
decreased levels of PC-1 in the plasma are associated with 
increased levels in the muscle, therefore a relatively low titer
is associated with insulin resistance. In addition, the soluble
PC-1 may be analyzed for the presence of a predisposing
polymorphism, e.g. the Q121 protein.

A kit may be provided for practice of the subject diag- 
nostic methods. Such a kit may contain hybridization probes 
that bind to a polymorphic PC-1^ sequence under hybrid- 
ization conditions where the probe does not bind to a wild 
type PC-1 sequence. Alternatively, antibodies specific for a 
polymorphic PC-1^ polypeptide may be included. Such a kit 
will typically include positive and negative nucleic acid or 
polypeptide controls, and such other buffers and reagents as 
may be necessary to practice the method. 

Modulation of Gene Expression

The PC-1 genes, gene fragments, or the encoded protein 
or protein fragments are useful in gene therapy to treat 
disorders associated with PC-1 insulin resistance. Expression
vectors may be used to introduce a PC-1 gene into a
cell. Such vectors generally have convenient restriction sites 
located near the promoter sequence to provide for the 
insertion of nucleic acid sequences. Transcription cassettes 
may be prepared comprising a transcription initiation region, 
the target gene or fragment thereof, and a transcriptional 
termination region. The transcription cassettes may be intro- 
duced into a variety of vectors, e.g. plasmid; retrovirus, e.g. 
lentivirus; adenovirus; and the like, where the vectors are 
able to transiently or stably be maintained in the cells, 
usually for a period of at least about one day, more usually 
for a period of at least about several days to several weeks. 

The gene or PC-1 protein may be introduced into tissues 
or host cells by any number of routes, including viral 
infection, microinjection, or fusion of vesicles. Jet injection
may also be used for intramuscular administration, as
described by Furth et al. (1992) Anal Biochem 205:365-368.
The DNA may be coated onto gold microparticles, and
delivered intradermally by a particle bombardment device,
or "gene gun" as described in the literature (see, for
example, Tang et al. (1992) Nature 356:152-154), where
gold microprojectiles are coated with PC-1 protein or
nucleic acids encoding PC-1, then bombarded into skin
cells.

Antisense molecules can be used to down-regulate 
expression of PC-1 in cells. The antisense reagent may be
antisense oligonucleotides (ODN), particularly synthetic
ODN having chemical modifications from native nucleic
acids, or nucleic acid constructs that express such antisense
molecules as RNA. The antisense sequence is complementary
to the mRNA of the targeted gene, and inhibits expression
of the targeted gene products. Antisense molecules
inhibit gene expression through various mechanisms, e.g. by 
reducing the amount of mRNA available for translation, 
through activation of RNAse H, or steric hindrance. One or 
a combination of antisense molecules may be administered, 
where a combination may comprise multiple different 
sequences. 

Alternatively, the antisense molecule is a synthetic oligo- 
nucleotide. Antisense oligonucleotides will generally be at 
least about 7, usually at least about 12, more usually at least 
about 20 nucleotides in length, and not more than about 500, 
usually not more than about 50, more usually not more than 
about 35 nucleotides in length, where the length is governed 
by efficiency of inhibition, specificity, including absence of 
cross-reactivity, and the like. It has been found that short 
oligonucleotides, of from 7 to 8 bases in length, can be 
strong and selective inhibitors of gene expression (see 
Wagner et al. (1996) Nature Biotechnology 14:840-844). 

A specific region or regions of the endogenous sense 
strand mRNA sequence is chosen to be complemented by 
the antisense sequence, preferably encompassing the [Q121] 
PC-1 mutation. Selection of a specific sequence for the 
oligonucleotide may use an empirical method, where several 
candidate sequences are assayed for inhibition of expression 
of the target gene in an in vitro or animal model. A 
combination of sequences may also be used, where several 
regions of the mRNA sequence are selected for antisense 
complementation. 
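
The selection of candidate antisense sequences can be sketched as below. This is a minimal illustration only: it assumes the target is supplied as a plain mRNA or cDNA string, the oligonucleotide is written as DNA, and the function names, the toy sequence, and the chosen position are hypothetical. Candidates enumerated this way would still have to be assayed empirically, as described above.

```python
# Complement table; U in an mRNA target pairs with A in the DNA oligo.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense(target):
    """Return the DNA antisense (reverse complement) of a target, 5'->3'."""
    return target.upper().translate(COMPLEMENT)[::-1]


def candidates_covering(seq, pos, length):
    """Yield (start, oligo) for each window of `length` bases that
    includes the 0-based position `pos` of the target sequence."""
    first = max(0, pos - length + 1)
    last = min(pos, len(seq) - length)
    for start in range(first, last + 1):
        yield start, antisense(seq[start:start + length])


# Toy target; position 3 stands in for the polymorphic base.
oligos = list(candidates_covering("ACGTACGT", pos=3, length=4))
```

Each returned oligonucleotide is complementary to a window of the sense strand that spans the chosen position, so a panel of overlapping candidates can be compared for inhibition.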

Nucleic acids may be naturally occurring, e.g. DNA or 
RNA, or may be synthetic analogs, as known in the art. Such 
analogs may be preferred for use as probes because of 
superior stability under assay conditions. Modifications in
the native structure, including alterations in the backbone,
sugars or heterocyclic bases, have been shown to increase
intracellular stability and binding affinity. Among useful
changes in the backbone chemistry are phosphorothioates;
phosphorodithioates, where both of the non-bridging oxygens
are substituted with sulfur; phosphoroamidites; alkyl
phosphotriesters and boranophosphates. Achiral phosphate
derivatives include 3'-O'-5'-S-phosphorothioate, 3'-S-5'-O-
phosphorothioate, 3'-CH2-5'-O-phosphonate and 3'-NH-5'-
O-phosphoroamidate. Peptide nucleic acids replace the
entire ribose phosphodiester backbone with a peptide linkage.

Sugar modifications are also used to enhance stability and
affinity. The α-anomer of deoxyribose may be used, where
the base is inverted with respect to the natural β-anomer. The
2'-OH of the ribose sugar may be altered to form 2'-O-methyl
or 2'-O-allyl sugars, which provides resistance to
degradation without compromising affinity.

Modification of the heterocyclic bases must maintain
proper base pairing. Some useful substitutions include
deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine
and 5-bromo-2'-deoxycytidine for deoxycytidine.
5-propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine
have been shown to increase affinity and
biological activity when substituted for deoxythymidine and
deoxycytidine, respectively.

The antisense molecules and/or other inhibitory agents are
administered by contact with cells under conditions that
permit entry. The molecules may be provided in solution or
in any other pharmacologically suitable form for
administration, such as a liposome suspension. There are
many delivery methods known in the art for enhancing the
uptake of nucleic acids by cells. Useful delivery systems
include Sendai virus-liposome delivery systems (see Rapaport
and Shai (1994) J. Biol. Chem. 269:15124-15131),
cationic liposomes, polymeric delivery gels or matrices,
porous balloon catheters (as disclosed by Shi et al. (1994)
Circulation 90:955-951; and Shi et al. (1994) Gene Therapy
1:408-414), retrovirus expression vectors, and the like.

The use of liposomes as a delivery vehicle is one method
of interest. The liposomes fuse with the cells of the target
site and deliver the contents of the lumen intracellularly. The
liposomes are maintained in contact with the cells for
sufficient time for fusion, using various means to maintain
contact, such as isolation, binding agents, and the like.
Liposomes may be prepared with purified proteins or peptides
that mediate fusion of membranes, such as Sendai virus
or influenza virus, etc. The lipids may be any useful
combination of known liposome forming lipids, including
cationic lipids, such as phosphatidylcholine. The remaining
lipid will normally be neutral lipids, such as cholesterol,
phosphatidyl serine, phosphatidyl glycerol, and the like.
The therapeutic agents are administered at a dose effective
to reduce expression level of PC-1^ at least about 50%, more
usually at least 80%, and preferably to substantially
undetectable levels.


Genetically Modified Cells and Animals 

The subject nucleic acids can be used to generate trans- 
genic animals or site specific gene modifications in cell 
lines. Transgenic animals may be made through homologous
recombination. Alternatively, a nucleic acid construct is
randomly integrated into the genome. Vectors for stable
integration include plasmids, retroviruses and other animal
viruses, YACs, and the like. The modified cells or animals
are useful in the study of PC-1 function and regulation. A
detectable marker, such as lac Z may be introduced into the
PC-1 locus, where upregulation of PC-1 expression will
result in an easily detected change in phenotype.
DNA constructs for homologous recombination will comprise
at least a portion of a polymorphic gene with the
desired genetic modification, and will include regions of
homology to the target locus. DNA constructs for random
integration need not include regions of homology to mediate
recombination. Conveniently, markers for positive and
negative selection are included. Methods for generating cells
having targeted gene modifications through homologous
recombination are known in the art. For various techniques
for transfecting mammalian cells, see Keown et al. (1990)
Methods in Enzymology 185:527-537.

For embryonic stem (ES) cells, an ES cell line may be
employed, or ES cells may be obtained freshly from a host,
e.g. mouse, rat, guinea pig, etc. Such cells are grown on an
appropriate fibroblast-feeder layer or grown in the presence
of leukemia inhibiting factor (LIF). When ES cells have
been transformed, they may be used to produce transgenic
animals. After transformation, the cells are plated onto a
feeder layer in an appropriate medium. Cells containing the
construct may be detected by employing a selective
medium. After sufficient time for colonies to grow, they are
picked and analyzed for the occurrence of homologous
recombination or integration of the construct. Those colonies
that are positive may then be used for embryo manipulation
and blastocyst injection. Blastocysts are obtained
from 4 to 6 week old superovulated females. The ES cells
are trypsinized, and the modified cells are injected into the
blastocoel of the blastocyst. After injection, the blastocysts
are returned to each uterine horn of pseudopregnant females.
Females are then allowed to go to term and the resulting
litters screened for mutant cells having the construct. By
providing for a different phenotype of the blastocyst and the
ES cells, chimeric progeny can be readily detected.

The chimeric animals are screened for the presence of the
modified gene and males and females having the modification
are mated to produce homozygous progeny. The transgenic
animals may be any non-human mammal, such as
laboratory animals, domestic animals, etc. The transgenic
animals may be used in functional studies, drug screening,
etc., e.g. to determine the effect of a candidate drug on
insulin resistance.

Drug Screening Assays

Drug screening identifies agents that inhibit or otherwise
modulate PC-1 function in cells. Of particular interest are
screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this
purpose, including labeled in vitro protein-protein binding
assays, electrophoretic mobility shift assays, immunoassays
for protein binding, and the like. The purified protein may
also be used for determination of three-dimensional crystal
structure, which can be used for modeling intermolecular
interactions, transporter function, etc.

The term "agent" as used herein describes any molecule,
e.g. protein or pharmaceutical, with the capability of altering
or mimicking the physiological function of PC-1. Generally
a plurality of assay mixtures are run in parallel with different
agent concentrations to obtain a differential response to the
various concentrations. Typically, one of these concentrations
serves as a negative control, i.e. at zero concentration
or below the level of detection.

Candidate agents encompass numerous chemical classes,
though typically they are organic molecules, preferably
small organic compounds having a molecular weight of
more than 50 and less than about 2,500 daltons. Candidate
agents comprise functional groups necessary for structural
interaction with proteins, particularly hydrogen bonding,
and typically include at least an amine, carbonyl, hydroxyl
or carboxyl group, preferably at least two of the functional
chemical groups. The candidate agents often comprise cyclical
carbon or heterocyclic structures and/or aromatic or
polyaromatic structures substituted with one or more of the
above functional groups. Candidate agents are also found
among biomolecules including peptides, saccharides, fatty
acids, steroids, purines, pyrimidines, derivatives, structural
analogs or combinations thereof.

Candidate agents are obtained from a wide variety of
sources including libraries of synthetic or natural compounds.
For example, numerous means are available for
random and directed synthesis of a wide variety of organic
compounds and biomolecules, including expression of randomized
oligonucleotides and oligopeptides. Alternatively,
libraries of natural compounds in the form of bacterial,
fungal, plant and animal extracts are available or readily
produced. Additionally, natural or synthetically produced
libraries and compounds are readily modified through conventional
chemical, physical and biochemical means, and
may be used to produce combinatorial libraries. Known
pharmacological agents may be subjected to directed or
random chemical modifications, such as acylation,
alkylation, esterification, amidification, etc. to produce
structural analogs.

Where the screening assay is a binding assay, one or more
of the molecules may be joined to a label, where the label
can directly or indirectly provide a detectable signal. Various
labels include radioisotopes, fluorescers, chemiluminescers,
enzymes, specific binding molecules, particles, e.g. magnetic
particles, and the like. Specific binding molecules
include pairs, such as biotin and streptavidin, digoxin and
antidigoxin etc. For the specific binding members, the
complementary member would normally be labeled with a
molecule that provides for detection, in accordance with
known procedures.

A variety of other reagents may be included in the
screening assay. These include reagents like salts, neutral
proteins, e.g. albumin, detergents, etc. that are used to
facilitate optimal protein-protein binding and/or reduce
non-specific or background interactions. Reagents that improve
the efficiency of the assay, such as protease inhibitors,
nuclease inhibitors, anti-microbial agents, etc. may be used.
The mixture of components is added in any order that
provides for the requisite binding. Incubations are performed
at any suitable temperature, typically between 4 and
40° C. Incubation periods are selected for optimum activity,
but may also be optimized to facilitate rapid high-throughput
screening. Typically between 0.1 and 1 hours will be sufficient.

The compounds having the desired pharmacological
activity may be administered in a physiologically acceptable
carrier to a host for treatment of insulin resistance or
hyperglycemia attributable to a defect in PC-1 function. The
compounds may also be used to inhibit PC-1 function in
resistance to insulin, etc. The inhibitory agents may be
administered in a variety of ways, orally, topically, parenterally
e.g. subcutaneously, intraperitoneally, by viral infection,
intravascularly, etc. Topical treatments are of particular
interest. Depending upon the manner of introduction, the
compounds may be formulated in a variety of ways. The
concentration of therapeutically active compound in the
formulation may vary from about 0.1-100 wt. %.

The pharmaceutical compositions can be prepared in
various forms, such as granules, tablets, pills, suppositories,
capsules, suspensions, salves, lotions and the like.
Pharmaceutical grade organic or inorganic carriers and/or diluents
suitable for oral and topical use can be used to make up
compositions containing the therapeutically-active compounds.
Diluents known to the art include aqueous media,
vegetable and animal oils and fats. Stabilizing agents, wetting
and emulsifying agents, salts for varying the osmotic
pressure or buffers for securing an adequate pH value, and
skin penetration enhancers can be used as auxiliary agents.

The following examples are put forth so as to provide
those of ordinary skill in the art with a complete disclosure
and description of how to make and use the subject
invention, and are not intended to limit the scope of what is
regarded as the invention. Efforts have been made to ensure
accuracy with respect to the numbers used (e.g. amounts,
temperature, concentrations, etc.) but some experimental
errors and deviations should be allowed for. Unless otherwise
indicated, parts are parts by weight, molecular weight
is average molecular weight, temperature is in degrees
centigrade; and pressure is at or near atmospheric.

EXPERIMENTAL 

EXAMPLE 1 

Polymorphic Variant of PC-1 Associated with
Insulin Resistance

Methods 
Subjects 

127 unrelated, healthy, non obese subjects (body mass
index, BMI, <30 Kg/m2), normotensive (blood
pressure <140/90 mm Hg) and normal glucose tolerant (by
OGTT), were studied. Plasma insulin levels were measured
before and during an OGTT that was carried out after 8 days
on a weight-maintaining diet. Insulin stimulated glucose
disposal was measured in a subgroup of 71 subjects by the
euglycaemic, hyperinsulinemic clamp.

Also studied were 132 type 2 diabetic patients (age=
66.5±8.0 yr, 60 male/72 female, BMI=28.9±4.5 Kg/m2) with
a strong family history of diabetes (one first degree relative
with type 2 diabetes at the minimum). To minimize the
possible inclusion of individuals affected by late onset type 
1 diabetes, patients were selected on the basis of age of 
diabetes onset ^45 yrs, BMI ^21 Kg/m^ and no need for 
insulin therapy. 

Informed consent was obtained from participants before
entry into the study, which was approved by the local
research ethics committee.
Polymorphism Screening 

Overlapping cosmid clones containing the PC-1 gene
were isolated by screening a human chromosome 6 specific
genomic library with human full length PC-1 cDNA.
Cosmids were digested with different four base cutter
restriction enzymes, blotted and hybridized to oligonucleotides
designed on the cDNA sequence. Positive fragments
were cloned and automatically sequenced. Intron-exon junctions
were deduced by comparing genomic and cDNA
sequences.

All exon amplimers, obtained using specific oligonucleotides
as primers, were analyzed in 40 unrelated and
unscreened individuals by Single Strand Conformation
Polymorphism (SSCP) which was performed as follows.
Amplification reaction products were denatured for 5 minutes
at ST C. in 90% formamide, 20 mM EDTA, 10 mM
NaOH. After denaturation samples were chilled on ice,
loaded on a native 8%-12% (according to amplimer size)
acrylamide (29:1 Acrylamide-Bisacrylamide) gel (0.04x20x42 cm)
in TBE and electrophoresed at 10 W constant power
for 13-16 hours at room temperature. After the
electrophoresis, gels were treated by silver staining. PCR
products showing different migration patterns at SSCP were
cloned in a TA-cloning vector (Stratagene) and four clones
for each sample were automatically sequenced from both
ends.

Exon 4 amplimers were obtained using oligonucleotides
4Fw [SEQ ID NO:9] (5'-ctgtgttcactttggacatgttg-3') and 4Rv
[SEQ ID NO:10] (5'-gacgttggaagataccaggttg-3') as primers.
PCR products were digested using AvaII restriction enzyme
and run on 12% native polyacrylamide gel for 2 hours at
500V. After the electrophoresis, gels were stained by silver
nitrate. On the gel K alleles are displayed as single, uncut,
bands of 238 bp, while Q alleles are shown as a doublet of
148 and 90 bp.
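
The band patterns described above can be turned into a genotype call as sketched below. This is an illustrative sketch, not the procedure used in the study: it assumes an uncut K band of 238 bp and a Q doublet of 148 and 90 bp (which together sum to 238 bp), and the size tolerance and function names are hypothetical.

```python
UNCUT_BP = 238        # K allele: amplimer not cut by the enzyme
CUT_BP = (148, 90)    # Q allele: amplimer cut into a doublet


def expected_bands(genotype):
    """Band sizes (bp, largest first) expected for 'KK', 'KQ' or 'QQ'."""
    bands = set()
    for allele in genotype:
        if allele == "K":
            bands.add(UNCUT_BP)
        elif allele == "Q":
            bands.update(CUT_BP)
        else:
            raise ValueError(f"unknown allele: {allele!r}")
    return sorted(bands, reverse=True)


def call_genotype(observed_bp, tolerance=5):
    """Call a genotype from observed band sizes, or None if uninterpretable.

    `tolerance` (bp) is an assumed allowance for sizing error on the gel.
    """
    def present(size):
        return any(abs(band - size) <= tolerance for band in observed_bp)

    uncut = present(UNCUT_BP)
    cut = all(present(size) for size in CUT_BP)
    if uncut and cut:
        return "KQ"
    if uncut:
        return "KK"
    if cut:
        return "QQ"
    return None
```

A heterozygote shows all three bands, so a lane with only one band of the doublet is returned as uninterpretable rather than forced into a call.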

One hundred sixty unscreened blood donors were genotyped
as background population. All genotypings were performed
in duplicate for each individual and, to prevent
observer bias, the investigator was unaware of sample origin.
Skin Fibroblast Culture and Insulin Receptor Autophosphorylation

Fibroblast cultures were established from 4-mm forearm
skin-punch biopsies. 125I-insulin binding data were obtained
by inhibition-competition studies. IR-TK (receptor 
autophosphorylation) was determined exposing cells for 10 
min to increasing insulin (0-100 nM) concentrations. Cells 
were then solubilized in 50 mM Hepes buffer, pH 7.6, 
containing 1% Triton X-100, 1 mM PMSF, 2 mM ortho- 
vanadate and 1% BSA and IR-TK determined. 
PC-1 Content in Muscle Tissue Specimens 

Muscle tissue specimens were obtained from the external 
oblique muscle at elective abdominal surgery 
(cholecystectomy) and were immediately frozen in liquid 
nitrogen. Soluble extracts were prepared from frozen muscle 
tissue and PC-1 content was measured by a specific ELISA 
as previously described and normalized for protein content. 
Statistical Analysis 

Group values are given as mean±SD. Student's t-test or
Mann Whitney U test were used to compare mean values of
2 groups. One-way ANOVA and both Student-Newman-
Keuls and Bonferroni t-test were used to compare mean
values of more than 2 groups. Two-way ANOVA test was
used to compare insulin dose-response curves of IR-TK.
Chi-square test was used to compare allele frequency. 
Results 

The PC-1 gene has been located on chromosome
6q22-23. Analysis of a YAC contig from the region allowed
it to be more finely mapped, to between markers D6S457
and WI-3398. Only exon 4, which extends from nucleotide
447 to 571 of the cDNA and codes for an extracellular
portion of PC-1, showed a polymorphic variant. When
screened by SSCP analysis and sequencing it revealed a
frequent first position A→C transversion at codon 121
(considering the second in frame ATG as the start codon)
(FIG. 1a). This single base change substitutes a glutamine
for a lysine in a cysteine-rich region of PC-1 (SEQ ID NO:1
and SEQ ID NO:3, respectively), and creates an AvaII
restriction enzyme recognition site. AvaII digestion of exon
4 amplimers cuts the Q allele PCR fragments, leaving the K
allele undigested (FIG. 1b). In 160 uncharacterized blood
donors, the Q allele frequency was 17.5%, with only 2 QQ
homozygotes. The observed genotype frequencies were in
agreement with those predicted by the Hardy-Weinberg
equilibrium.
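
The Hardy-Weinberg comparison can be illustrated with the figures given above. The sketch below is an assumption-laden reconstruction: the KQ and KK counts are not reported and are derived here from the stated allele frequency (17.5% of 320 alleles) and the 2 QQ homozygotes, and a simple chi-square goodness-of-fit statistic with 1 degree of freedom is used, which may differ from the test the authors applied.

```python
def hardy_weinberg_expected(n_subjects, q_freq):
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    p_freq = 1.0 - q_freq
    return {
        "KK": n_subjects * p_freq * p_freq,
        "KQ": 2 * n_subjects * p_freq * q_freq,
        "QQ": n_subjects * q_freq * q_freq,
    }


def chi_square(observed, expected):
    """Pearson chi-square statistic over matching genotype classes."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


n_subjects = 160
q_freq = 0.175
q_alleles = round(2 * n_subjects * q_freq)       # Q alleles among 320

# Derived (not reported) genotype counts, given 2 QQ homozygotes.
observed = {"QQ": 2}
observed["KQ"] = q_alleles - 2 * observed["QQ"]
observed["KK"] = n_subjects - observed["KQ"] - observed["QQ"]

expected = hardy_weinberg_expected(n_subjects, q_freq)
stat = chi_square(observed, expected)

# 3.84 is the 5% critical value of chi-square with 1 degree of freedom.
consistent_with_hwe = stat < 3.84
```

Under these assumptions the expected number of QQ homozygotes is about 5, against 2 observed, and the statistic stays below the 5% critical value.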

Having identified a PC-1 polymorphism which changes 
both amino acid composition and electric charge, and thus 
with potential biological relevance, we searched for an 
association with insulin resistance. Accordingly we studied
127 unrelated, healthy, non obese, normotensive, non dia- 
betic subjects resident in Sicily. As expected, these individu- 
als showed a wide range of plasma insulin levels during 
OGTT, a finding which in the presence of normal glucose 
tolerance, indicates a wide range of insulin sensitivity. These 
data were confirmed by the euglycemic hyperinsulinemic 
glucose clamp, a more quantitative technique for the mea- 
surement of insulin sensitivity. In a subgroup of 71 indi- 
viduals the M values for insulin stimulated glucose disposal 
ranged from 2.34 to 9.62 mg/Kg/min. 

Table 1 summarizes the clinical features of these 2 groups.
Q allele carriers showed higher fasting plasma glucose 
(p<0.001) (Table 1 and FIG. 2a) values. They also showed 
higher plasma insulin values at 60 (p<0.05) and 120 
(p<0.01) minutes during OGTT (FIGS. 2a and 2b). 



TABLE 1

Clinical characteristics of the subjects studied

Genotype           Gender   Age          BMI          FPG          FIRI
                   (M/F)    (years)      (Kg/m2)      (mmol/l)     (pmol/l)

KK (n = 45)        27/18    36.6 ± 2.1   23.8 ± 0.5   4.7 ± 0.1    49.0 ± 4.0
KQ or QQ (n = 22)  18/4     40.3 ± 3.1   24.2 ± 0.8   5.1 ± 0.1*   60.0 ± 8.0

Data are expressed as mean ± SEM.
*p < 0.01 vs. KK subjects
BMI = body mass index
FPG = fasting plasma glucose
FIRI = fasting immunoreactive insulin



In the subjects studied by glucose clamp, insulin stimu- 
lated glucose disposal was lower in Q allele carriers when 
compared to KK allele age, sex and BMI matched subjects. 
No difference was observed in insulin levels at steady state 
during clamp studies in the 2 groups (485±165 pmol/l vs
460±78). On the average, therefore, Q allele carriers were
insulin resistant and maintained normal glucose tolerance 
due to compensatory hyperinsulinemia. Mean blood 
pressure, plasma total cholesterol, HDL cholesterol and 
triglyceride levels were not different between the 2 groups. 

Of the 2 subjects with QQ alleles, one was a 35 yr. old 
male who was studied by euglycemic clamp and had the 
second lowest M value (M=2.57 mg/Kg/min) of all the
XY males studied. The BMI (28 Kg/m2), blood pressure
(138/90 mm Hg), and lipid profile (cholesterol/HDL ratio
being 0.16 and triglycerides 176 mg/dl) were in the upper
range of the studied individuals. The second QQ subject was
a 52 yr. old female whose BMI (21 Kg/m2), blood pressure,
and lipid profile were entirely normal. She did not agree
to be studied by euglycemic clamp. Both QQ subjects were 
first degree relatives of a type 2 diabetic patient. 

Subjects were subdivided into tertiles according to
plasma insulin levels at 120 minutes during the OGTT
(tertile 1=low, tertile 2=intermediate, and tertile 3=high
insulin levels). As expected, the mean M value for glucose
disposal progressively decreased from tertile 1 to tertile 3
(7.22±0.26 mg/Kg/min, n=21 vs. 5.86±0.28, n=25 and
4.89±0.23, n=25, p<0.001). Q allele frequency was similar in
subjects from tertiles 1 and 2 (11.7%, n=34 and 10.6%, n=33,
respectively), but it was much higher in tertile 3 insulin
resistant subjects (29.4%, n=34, p<0.01 when compared to
the remaining 67 subjects, 11.2%). Also, in 132 type 2
diabetic patients Q allele frequency was higher (20.8%,
p<0.01) than in tertile 1 and 2 subjects with no difference
between obese (BMI>30 Kg/m2, n=90) and non obese
(n=42) patients (21.1% and 20.2%, respectively).
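
The allele frequency comparison between tertile 3 and the remaining subjects can be reproduced approximately as follows. This is a sketch under stated assumptions: allele counts are back-calculated from the reported percentages and subject numbers (two alleles per subject), and an uncorrected Pearson chi-square on the resulting 2x2 table stands in for the chi-square test named in the Methods, whose exact form is not given.

```python
def allele_counts(n_subjects, q_percent):
    """Back-calculate (Q, K) allele counts from a reported Q frequency."""
    total = 2 * n_subjects
    q_count = round(total * q_percent / 100)
    return q_count, total - q_count


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


t3_q, t3_k = allele_counts(34, 29.4)       # tertile 3 subjects
rest_q, rest_k = allele_counts(67, 11.2)   # tertiles 1 and 2 combined

stat = chi_square_2x2(t3_q, t3_k, rest_q, rest_k)

# 6.635 is the 1% critical value of chi-square with 1 degree of freedom.
significant_at_1_percent = stat > 6.635
```

With these derived counts the statistic exceeds the 1% critical value, in line with the reported p<0.01.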
TABLE 2

Q allele frequency and insulin sensitivity (M value) in subjects divided in
tertiles according to plasma insulin level at 120 min during OGTT
(IRI 120 min)

Tertile (IRI 120 min range)   Q allele frequency %   M value (mg/kg/min)

1 (1353-300 pmol/l)           29# (n = 22)           4.50 ± 0.36* (n = 17)
2 (273-147 pmol/l)            15 (n = 23)            5.46 ± 0.30** (n = 20)
3 (140-27 pmol/l)             9 (n = 22)             7.21 ± 0.26 (n = 17)

Data are expressed as mean ± SEM. Numbers of subjects are given in
parentheses.
#p < 0.05 vs. tertile 3
*p < 0.01 vs. tertile 2 and 3
**p < 0.01 vs. tertile 3



In order to exclude any association of the Q allele variant
with other changes in PC-1, each of the 25 exons from a QQ
control was sequenced from the start to the stop codon. No
other base change was detected.

In order to study IR autophosphorylation activity, cultured
fibroblasts from 5 Q/K and 5 gender, age and BMI matched
KK subjects were selected on the basis of a similar PC-1
protein content (50.3±8.7 and 60.8±15.4 ng/0.1 mg protein,
respectively). Q/K fibroblasts showed a reduced IR
autophosphorylation activity (p<0.01) (FIG. 3). Insulin binding
to its receptor was also studied, and no difference was found
in either total specific binding (% of bound/total
radioactivity=0.52±0.10 per 0.1 mg protein and 0.55±0.11 in
Q/K and KK subjects, respectively) or IC50 (0.27±0.05 nmol/l
and 0.26±0.08).

PC-1 content was not significantly different in muscle
specimens from 8 QK and 26 KK sex, age and BMI matched
subjects (36.5±5.1 ng/mg protein vs. 25.9±2.6 in QK and
KK subjects, respectively).
Discussion 

The data provided herein demonstrate that a PC-1 gene
polymorphism (K121Q in exon 4) is associated with
decreased insulin sensitivity in healthy non-diabetic
individuals. Because insulin resistance is a major risk factor for
the development of type 2 diabetes, Q allele carriers may be
at higher risk of developing diabetes. This is supported by the
high Q allele frequency observed in patients with type 2
diabetes mellitus. No association was observed with BMI
in either healthy or diabetic individuals.
We previously reported that increased PC-1 content in
skeletal muscle and adipose tissue is associated with insulin
resistance. In addition, when cultured cells overexpress
PC-1 they are insulin resistant secondary to both decreased
IR tyrosine kinase activity and reduced downstream signaling
steps. These latter observations indicate that an increased
PC-1 content may play a role in insulin resistance through
the inhibition of IR-TK activity.

PC-1 content is not significantly different in skeletal
muscle from KQ with respect to KK subjects, indicating that
insulin resistance in Q allele subjects is not due to an
increased PC-1 protein content. Again, these data suggest
that structural differences between the 2 variant proteins
may account for different insulin sensitivity, independent of
protein content. These data indicate PC-1 is an important
candidate for the genetic regulation of whole body insulin
sensitivity. PC-1 genotyping can be used for identifying
individuals at risk of developing insulin resistance.
EXAMPLE 2 

Fasting Plasma PC-1 and its Regulation by Insulin

A soluble form of PC-1 is generated by intracellular
cleavage of its transmembrane domain, and subsequently
released by the cell. It is not known whether soluble PC-1
circulates in human plasma. The possibility of measuring
PC-1 in human plasma would considerably increase the
feasibility of screening studies.

A sensitive and specific ELISA was set up, and used to
measure plasma PC-1 concentration before and after a
2-hour euglycemic hyperinsulinemic clamp in 22 healthy
control subjects and 27 subjects affected by diseases known to be
associated with insulin resistance (i.e. obesity and essential
hypertension). The obtained results indicate that low fasting
level and abnormal acute regulation by insulin of plasma
PC-1 concentration are associated with several features of
the "metabolic syndrome", including abdominal fat
distribution, high blood pressure and low insulin sensitivity
on both glucose and lipid metabolism.
Methods 

Plasma PC-1 Measurement 

Wells in Maxisorb plates were precoated by overnight incubation
at 40° C. with an affinity purified polyclonal antibody
to PC-1. After washing with TBST buffer (20 mM Tris, 150
mM NaCl, 0.05% Tween-20) to remove unbound antibody,
wells were blocked with 150 μl TBST containing 1% bovine
serum albumin (BSA) (30 min at 56° C.), and washed again
with TBST. Then, human plasma (10-30 μl diluted to a total
volume of 100 μl with 50 mM HEPES buffer, pH 7.6,
containing 0.05% Tween-20, 1 mM PMSF, 2 mM
orthovanadate, 1% BSA and 1 mg/ml bacitracin) was added
to each well and PC-1 was allowed to bind overnight at 4°
C. After extensive washing with TBST, a biotinylated anti-PC-1
monoclonal antibody was added in the 50 mM HEPES
buffer. After 2 hr at 22° C., peroxidase-streptavidin was
added and 30 min later, wells were washed again with TBST
and then 100 μl of biotinyl-tyramide solution was added.
After 15 min incubation at 22° C., wells were washed with
TBST and streptavidin-horseradish peroxidase was added
(30 min at 22° C.). After further extensive washing, the
peroxidase activity was determined colorimetrically by adding
3,3',5,5'-tetramethylbenzidine (TMB) at a concentration
of 0.4 g/l in an organic base, and measuring the absorbance
at 451 nm.
Muscle PC-1 Measurement 

Muscle tissue specimens were obtained from the external
oblique muscle at elective abdominal surgery
(cholecystectomy). After adipose tissue was dissected and
blood removed, specimens were immediately frozen in
liquid nitrogen. Soluble extracts were prepared from frozen
muscle tissue as previously described. Briefly, muscle tissue
(approximately 150 mg) was pulverized under liquid nitrogen
and then homogenized in 2 ml buffer (50 mM HEPES,
150 mM NaCl, 2 mM PMSF, pH 7.6) at 4° C. using a
polytron homogenizer for 10 sec. at medium speed. Triton
X-100 was added to a final concentration of 1%, and the
homogenates were solubilized for 60 min at 4° C. The material
was centrifuged at 100K g for 60 min at 4° C. and the
supernatants used for the PC-1 content measurement.
Statistical Analysis 

One way analysis of variance (ANOVA) was utilized
when mean values from 3 groups were compared. Paired
Student's t test was utilized to compare mean values before
and after clamp.

Correlation analysis (either "Pearson" if the data were distributed
normally or "Spearman" if the data were not distributed
normally) was used to look for numerical relationships
between values. Statistically significant correlations
were confirmed by linear regression analysis. Stepwise
regression analysis was utilized for multiple correlations.
Data are given as mean±SEM.
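
The two correlation coefficients named above can be sketched as follows. This is only a minimal illustration using the Python standard library, not the software used in the study; the function names are hypothetical, and the choice between the two coefficients (i.e. the normality check) is left to the caller because the text does not state how normality was assessed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Spearman is computed here as Pearson on average ranks, which handles ties without a separate correction formula.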



Results 

Subjects Studied 

Twenty two healthy control subjects and 27 subjects affected by
either obesity (BMI>28, n=10) or essential hypertension
(mean blood pressure>108 mm Hg, n=12) or both (n=5)
were studied. Clinical and metabolic features of the 49
subjects are shown in Table 3. As expected, insulin
sensitivity, as indicated by M values derived by euglycemic
hyperinsulinemic clamp studies, was significantly reduced
in obese and/or hypertensive patients as compared to normal
controls.

TABLE 3

                      age    sex     BMI    W/H    MBP    BG     IRI    M

Control
  mean                37     12/10   23.8   0.83   90     5.1    65     6.2
  SEM                 2              0.4    0.03   2      0.1    7      0.4

Insulin Resistant
  mean                47     19/8    29.2   0.92   109    5.3    80     4.8
  SEM                 2              0.8    0.02   3      0.1    7      0.3



Plasma PC-1 Concentration 

Fasting plasma PC-1 was measured by ELISA. Human
plasma produced a dilution slope that paralleled the PC-1
standard. Intra- and inter-assay coefficients of variation
were <8%. Plasma PC-1 concentration ranged from 1 to 70
ng/ml with a mean±SEM of 26.5±2.9 and a median of 24.5.
No significant difference was observed between plasma
PC-1 concentration in control (27.7±4.5, n=22) and insulin
resistant obese and/or hypertensive (25.6±3.9, n=27) subjects.

When the 49 subjects were considered together, plasma 
PC-1 concentration was correlated negatively with both 
waist/hip ratio (-0.49, p=0.001) and systolic blood pressure 
(-0.36, p=0.016) and positively (0.40, p=0.01) with the 
ability of insulin to suppress plasma FFA (delta FFA, cal- 
culated by subtracting basal FFA from FFA after the two 
hour euglycemic hyperinsulinemic clamp). Plasma PC-1 
concentration remained significantly correlated with the 
waist/hip ratio also when data were adjusted for BMI and 
sex (p=0.0024), with systolic blood pressure also when data 
were adjusted for sex and age (p=0.019) and with delta FFA 
also when data were adjusted for BMI, sex and waist/hip 
ratio (p=0.037).

These data demonstrate that PC-1 circulates in human 
plasma and that low plasma PC-1 level is independently 
associated with several features of the "metabolic syn- 
drome" including abdominal fat distribution, high blood 
pressure and, so far as lipid metabolism is concerned, insulin 
resistance. 

Insulin Stimulated Values 

In order to verify whether insulin exerts any effect on
plasma PC-1, PC-1 was measured after a two hour euglycemic
hyperinsulinemic clamp. Although the mean plasma PC-1
concentration in the 49 subjects after clamp was not different
from the basal plasma PC-1 level (26.3±3.9 vs.
26.5±4.1), a wide range of individual effects of insulin
infusion was observed, from subjects showing an increase
to subjects showing either no change or a reduction in
plasma PC-1. When subjects were divided in tertiles according
to their whole body insulin sensitivity on glucose values
(M values), with the most insulin sensitive in tertile 1 and the
most resistant in tertile 3, insulin stimulated PC-1 concentrations
were significantly higher than basal plasma PC-1
concentration in subjects from tertile 1 (21.7±5.4 vs.
25.8±5.5 before and after clamp, respectively, p=0.015, n=16)
but not in subjects from tertile 2 (31.7±5.0 vs. 31.9±4.6,
n.s., n=17) and 3 (25.9±4.7 vs. 20.7±3.7, n=16). Moreover,
the net effect of insulin on plasma PC-1 concentration (delta
PC-1, calculated by subtracting basal PC-1 from insulin
stimulated PC-1) was positively correlated with M value
(0.37, p=0.009) and negatively with BMI (-0.37, p=0.009).
A significant (p=0.005) positive correlation between M and
delta PC-1 values was observed also when data were adjusted
for BMI. A similar correlation between M and delta PC-1
values was observed also when control (0.049, p=0.05,
n=22) and insulin resistant obese and/or hypertensive (0.48,
p=0.01, n=27) subjects were considered separately.

These data demonstrate that insulin infusion is able to
increase plasma PC-1 concentration in the most insulin
sensitive subjects and that this effect is blunted in subjects
with lower insulin sensitivity.
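
The tertile grouping by M value and the delta PC-1 calculation described above can be sketched as follows. This is an illustrative sketch only: the function names and the cut-point rule (rounding n/3 and 2n/3) are assumptions, since the text does not state how the tertile boundaries were chosen.

```python
def delta_pc1(basal: float, insulin_stimulated: float) -> float:
    """Net effect of insulin: insulin stimulated PC-1 minus basal PC-1 (ng/ml)."""
    return insulin_stimulated - basal


def split_into_tertiles(m_values):
    """Return three lists of subject indices, most insulin sensitive first.

    Tertile 1 holds the highest M values and tertile 3 the lowest.
    The cut points round(n/3) and round(2n/3) are an assumed rule.
    """
    order = sorted(range(len(m_values)), key=lambda i: m_values[i], reverse=True)
    n = len(order)
    first_cut, second_cut = round(n / 3), round(2 * n / 3)
    return order[:first_cut], order[first_cut:second_cut], order[second_cut:]


# Hypothetical M values (mg/kg/min) for six subjects, for illustration only.
m_values = [5.0, 7.0, 3.0, 6.0, 4.0, 8.0]
tertile_1, tertile_2, tertile_3 = split_into_tertiles(m_values)
print(tertile_1, tertile_2, tertile_3)
```

With 49 subjects this rounding rule happens to give groups of 16, 17 and 16, the group sizes reported above; whether the same rule was actually applied is not stated.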
Plasma v. Muscle Tissue PC-1 

In order to verify whether plasma PC-1 concentration was
related to PC-1 content in skeletal muscle, we quantified
PC-1 in both plasma and biopsied external oblique muscle of
9 additional subjects. PC-1 concentration in plasma was
inversely correlated with PC-1 content in muscle (-0.9,
p=0.01). These data are compatible with the possibility that
the increased PC-1 content previously reported in skeletal
muscle of insulin resistant subjects is, at least in part, due to
reduction of PC-1 intracellular degradation, and its subsequent
release into extracellular fluids, at the level of skeletal
muscle tissue.

EXAMPLE 3 

Intron/Exon Structure of PC-1 

The nucleic acid sequences provided below are the intron-exon
boundaries for the human PC-1 gene. They contain all the
intron sequences immediately flanking the PC-1 exons. A
few bases at the 5' and 3' ends of each exon are also provided;
these are separated from the intron by a "-" sign, and are
further in bold type.

The 3' flanking sequence to exon 2 (i.e. intron 3 at its 5'
end) contains a GT repeat that is polymorphic, and provides
a marker for genotyping of this locus. The sequences flanking
the boundaries or crossing them are useful for specific
amplification of the exons.
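
Because a polymorphic GT repeat is typically scored by its number of repeat units, a simple way to locate and measure such a repeat in a sequence string is sketched below. The function name, the minimum-unit threshold and the example sequence are hypothetical and serve only as illustration; in practice repeat length is usually determined from the size of an amplified fragment.

```python
import re


def longest_gt_repeat(sequence: str, min_units: int = 3):
    """Return (start_index, unit_count) of the longest (GT)n run, or None.

    Only runs with at least `min_units` consecutive GT units are considered.
    """
    pattern = re.compile(r"(?:GT){%d,}" % min_units)
    best = None
    for match in pattern.finditer(sequence.upper()):
        if best is None or len(match.group()) > len(best.group()):
            best = match
    if best is None:
        return None
    return best.start(), len(best.group()) // 2


# Synthetic example sequence, not taken from the listing below.
print(longest_gt_repeat("AAGTGTGTGTCC"))
```

Two alleles that differ in the returned unit count would be distinguishable as different marker alleles at this locus.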



Intron Exon borders 
[SEQ ID NO: 11] Exon 1 

CTCTCGCTO-GTAGGTCCGCGGCCAGGCCCCGGCGCCCGGGAGGGCTGGGAATAC 



NGGGAGGGCGGCGCCGAGCTCCTGCGCTCTCAGCGCACTCAGCACCGGGCACGGA 
[SEQ ID NO: 12] Exon 2

TGAGCTCCACCGGGCCGGCGGCCGCTCTAGAACTAGTGGATCATGCCACTGTACCCTAGCCTGGGTAACAGAGTA 

AGACACTATCTCTAAAAATAAAAAATAAGATAAAATATTTTTTAAAAAAGAAACCATGTAATTTTCTCTTTTCTC 
CCTACAG-GTATTG ... . AGAAC-GTAATTAGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGT 



GTGTGCACAGCCTTATTAAGAATGTGATTGAGGTAAACATTATCTCCTATTCCCAAGGGGTAC 
[SEQ ID NO: 13] Exon 3 

AGATTTTTGCCTTACTTTATTACCCCATCTGTATTTTGTAAAGTAGTATTTGAACCTAGTGTACACCTAACTTAG 
TTGTATTCGTTGATGTTTACTTTGAATTATATAATGATTAGAAACATCTGACTTATCGTTCAATTTTTTCAG- 



TTAA CCAG-GTAAG 



GATGAGCAGGGAAAAAAGTGGAGTTATGGTCATTAGGAAAAGATCCACTAGTTCTAGAGCGGCCGCCACGCCCGG 



TGGAGCTT 



[SEQ ID NO: 14] Exon 4

CGCGGCGGCCGTTCTAGAACTAGTGGATCATACTCACGAAGACAGCAATTCTGTGTTCACTTTGGACATGTTGAA 



TTTGAGACATAAAACACATTTTGCTGATGTTTGTTTCTAG-AaCATA. . .GTCAAG- 



GTCAGGTGCTCGTTGGGCTCTGCAGCAACCTGGT 



ATCTTCCAACCTCTTAACGGGGCTNTACATAAGTGTTATCTTTTATATTAAGANTCATGGCTATTGGGCC 
[SEQ ID NO: 15] Exon 5 

AATCTGTTCACATACTTTGTTTGTGGAATCTGTCTTAATGTGTCTCACAAGCATCACAATTATTAT^ 



GTGTGTTCATTTTATTTTCTTGAAAATATTTTAG-GT OA6AA GCAGO- 

GTAAGATTATATTCTGAGGTATTAATTTTTTCTTTTTT 

AGAAGTACAGCATCATTTTTTTCTTTCCAAATTAAGATGATAAAAATAATAAAATCACTGGTTTATTAAACATTA 



CAGGTTGAGTATCCTTTATCCAAAATGTTTGGTATGAGAACTGTrTTGGATTTTGGACTTTTTTGGATTTTGCAA 



TATT 



[SEQ ID NO: 16] Exon 6 

CCGCAGCCCGGGGGATCACACAGACCTTAGTGGAAAATCTTCACTGGACCTGTGCCAAGAAGGGGGTACATCTTC 

GTGAGTAACTTCAGAG 

TTTACTGCTGGAATATCACCATTTCAGTGAGATTGACTAGGCAGGCAGTCTTTCTTGGAAAAGTACTGGCAGAAC 
CTAACTGTTTCACTAAACTTTTCTAATGGGCAAAGTAGTTGAACCTTGTGTAgGGCGCCTTATCTTTAATAATGT 
GA 

[SEQ ID NO: 17] Exon 7 

TAAGAGAAAAATGAAGTCATCTTTAAGATTGGATTTGTATCCACAGTGTTGCTTTATAATTCATCCTGAATTTTT 
ATCTGATTAAAATCCCTCCTGGGTAATTTTTTTTACGTGATTTAGACTGCTGTGGTACCACTGCTAAATGAGGTA 

AGCCAATTGTCAGATGTATTTAATAACAATGTTTATTTTTTTCCCTTCTAG-AAAAATGT TCACC- 

GTAAGCTCTGCATTTCAACTTCTATCTGTTTGAAGAAGTGAGATGGGATTGTAACATTTTTTGAGGGAATAGAT^ 
TAAGATAAAAGAAAAACAACTTATTTTCCAATAGGTAGTTAAGTAAGGAAACCCAGGTTCTGATCTTTGCTCTGC 
CACAAACTAGCTGTGGCT 
[SEQ ID NO: 18] Exon 8

ACTACATAAAATCTTAAGAGGTTGCGTTTTGCCATTACCTGATTTTTTTGTTTTTCTT 

ATTCCy^TGTAGCTTCAGTTATCGGTTTCTTTTTGATGATTTTTTTCTGTGAATGTATTTAACATTAAGT 

AACTTGCATATAATCTGT 

TTTATCTTTTTTAG-GGATT AACCA-GTGAGTTCTTTGTTTTTCTACTAA 

AATAGTTAATTATTCTCATCTATTTCAATCAGAGTAAAATAACCAGATTCTCTAGAGCTTTTAATAACTGATTTC 

ATTTAGTGTGTCTGTGGCCAT 

[SEQ ID NO: 19] Exon 9 

TAATCTCTGACTATTTAATATGTTGTTGCTGCTTAAGAGTCATATTACATGATTATTGTCGTCTAAGTGCTGAAG 
CTTGTTGACCTTAAAAGCATTCTAGCACTAGAGAGGAATGCATTGGTGTGGTATGAAAACATACTTTCCTAAG 

ATGAATGTTGCATGATTTCTTAATTTTCCTTCATTTTCTGCTCCAG-ATTTGG AATGC-GTATGTG 

AAATGAATTTTTTCTAGGATCTGTAATATAGAACAGCTTATTCTTATGTAATGTGGTTTTTATTGAATC 
TTTAGCATTTGAGTGATATGTTGGCTGAAAAATGAGAACTGAAGAACTCTTTCTCAAAGAGTTTAGATAGATGGT 
AAATGGACAGTAAAACTA 
[SEQ ID NO: 20] Exon 10 

GGGAAAATAAAGTTTTCAAATAAAACCCTTGATTTCAAACACAATAGATGCGAAATAGCATTTACTAGCTCTO 

TGACATTTTCAATGAAAAAAACTATATTTTACACCCAAACAATTGTCAGCCATCTT'nATTTTTGTTTGTTCTTC 

ATTTTAG-TTCAGTA. . . .AGATGAAAG-GTCTGTAGGCAATTAATTTCTATTGTAAATACTTCGTTTTGTA 

GAAATGATATACTATTTTCCCCTAGACTACAACAAAACTTTGCTATTTGCIATGATGTTTTATATCGAAATAAAT 

TCTTTAGTAAATGATC 

[SEQ ID NO: 21] Exon 11 

GAATTTCAAAGCTGTAAATTAATTTCTCAGTAGAACTGTTACACCAGTGTTATAAAATTAATCCCTATCAATTGA 
GGAATTATTTTTTCCATTCTGTTTTTCAATGTGTTCGTAAAATATTACATTTTGATACTGTTTGATTT^ 
ACCACA.. .CAGIGA[1 -GTAA 

GTACATTTTTCTCAGTAATTATTTCATTAAACCCAGTCATCGGGCTGAACCTCGCTTTGAAGGAGGCTC^ 
CATTTTATAAGATTCTATCATTTCTGGAAAAAGCAAGTATTATACACAATATTACTAAATATAAGGATGCACTTT 
AAACAAAATAAGAGTTGG 
[SEQ ID NO: 22] Exon 12

6TCTTAGTTTAATGTGAATCAGCTCATTGTAGTTGCATCCACTGGCCCAAATCTATCAATCTGTCGGTCTTTCTT 



GTATCACATGAGGTTGTGCTTCCCATTCTT AG-GTCATC ATCATG-GTAATCTGAATTTGCATTA 

TTTACTCTTCAGGATAAAGGGCTGAAGJ^GTTTACTTGATGGTTTCCCAATTTTTTGTGAATGTTGTA 
TCTTTTTTAAAAATGTAGTTTCTTATGGACAGTCTTTAGGAAAAAAATACATTAAATATAAAATATAAGTGAAAC 
ACAGAATTCACAGAAACC 
[SEQ ID NO: 23] Exon 13

GATTTTGAAAAAAGTGAAGT6ATAGGTACAGCTGAAATTCTGTCTTACCTATCA6ATCTTCAACTAATATGAGTG 
CTACACCCATGTIITAACGAATTTAACCTTGGAAGTGAAAGAAGTTCTGCTCTGCATATTAAATO 

TTACAGCATGTTTTGGGATTTTTTTTTTCTCCTAG-GCATGG TACTATTCAT-GTAAGTATATCTC 

TGTGATAACTTTGAATATGGTCATATTAAGAATACCTTCCTTTAGGCCGGGCACAGTGGCTCATGCCTGTAATCG 
CAGCACTTTGGGAGGCCAAAGTGGGTGGTCACCTGAGGTCAGGAGTTCGAGACCAGCCTGGCCAACATGGTGAAA 
CCCTAAAAATACATATAC 
[SEQ ID NO: 24] Exon 14

GATCCAAACTCTGCATTTAAATACCAAGGCAGGTTTTAAAGAGTTCATTTAAGTCATTACATTGTAGCCACTGAA 

AGGAATTAGACAGACCTTTAGGGATCTGACATTCTATATTTTTGTATTATGTTTTAATATAGTATACAATCAAAC 

TATTAATTCTTATGTTT6TTCCX:CTCCAG-TTAACPATGAAG6CATTGCCCGAAArCTTTCT-GTC 

TCTTTATTTTCCATTATCTAGTTATTTTTACTTTTGTATAATATATATTGAGAGAAAAGTTTCAGCATCTATTAT 

TGGGATT6AAGGATTAGAATATTTTAGTAATCTGGGCCAACATGGAAATGCTGTGTAGTTTAAAGATC 

[SEQ ID NO: 25] Exon 15 

CTGATGAAATGTTTGTGAAAAAAAATTTCATATGAAGTTAGAAAGCAATTTCAAGAAAAGTTGACACTTTTTATA 
GATATTAGGGAAATATCTTTCCCTAATAAATATCTTTCCCTAAAAAAGTTGACACTTTTTTAGATATTAGGGAAA 

TAATAGTTTTTCTTTGCT 

GTrTGCAATTTCAG-TGCCGGG GCATT-GTAAGTTCTGACAGTCTCCCAG 

GTAAACTTAGTCTGATCGGTTAGTGATTCAGGGTAACCATTGGGCCCTOTCTAACAATATTGTTATGTC 
GTATAAGTATGATTCTCTTCACTCTAACCCAGGATTTCTAATGTCGGCCTATGGATGTTTGAGTTAGATAAOT 
TTGTTGTGGAGAGCTGTC 
[SEQ ID NO: 26] Exon 16

AAAAGATAGAGGTGACTTCTTAATGCTTTTCAAAGCCAGGTGGTTTTATTTACCGTTGTGTTGGTTTAACAAAAT 
AGTTACATACTTTTTAATCAATGAAAATAATGTTATG 

ATTATCAATTATGTTTTATGAAAGGACTTTACATTTTTAATTCATATATGTCAA^ . . .GCAA- 

TCTAAAGAAAAAATGATATGCAAAGTTTTAGACTTGAAAACATACTGTGATTATATGTCTTGAATGAGAATTAAT 

GGAACATACTTTCATAAAGCTATTTTTCTTTGAACATIAAAGAATTTTGTTAAAGTTTTATATTCATTGGCTATT 

ACTAAAAAGTCAAAAAAC 

[SEQ ID NO: 27] Exon 17 

AAAACTAAGAGACCTATCCTAGATGTCCTTAGATTATGTGTGTGATAGGGTTAAAACTATATTTCCCACAAAGTC 
CACTGAGCGTGGTAGTTTTCCTCTTATCTTATCATAACCAGTTTGTATATGTACAATGTGGATAACAGAATTTTT 

GGGACCAACTTGTAGACAGCTGAAATGCACTGATAAACTTCCTTTTCTGGCCATCTAG-GCCCT GTG 

T6-GTAAGTGTGAACAGGTGCCTTTTTTCCCTTCTGAAAATAGACCTGAAATAGGA 

TTATCAAAAGCAGGTCACATTGTAGGCAACTTTGTGGAGATGATGGTGAGGCAAGACAGATTTTTACCTTC 

TGACTCTCAGACTCACTGAAGAAATGTGGGGAACATG 

[SEQ ID NO: 28] Exon 18

CATATCAGTATTTCTATTAAAAATAACCTAGTCTTAAATACTCTAAAACCCAAGAGAGTTTTATACTTTTATTTT 
AGTTAAAGAGTAAATGACTCATGTATTTGGTTTTAAAAAAGTAAAGATCATGGCACAAGTCTACTATTTGTTTGA 

TTTGAAACATCTAAGTAACTCTACC ATCTTGAAATTATGCAG-ATTTA CTTCG-GTAAGTATCGTCAA 

GAAGTTTGGTCCAGTATGTATGGTTTGATAGCACCCTCTGCATAGCATGTGCTGTAAAAATACTTAATAATCAAA 
TTAgJ^TTTAGGAGTGGGGGTAGGTAAACATATGTTTTAATTCTAGGGGGCGCATGTAJ^TCTTTTGTC

CTTTTCTCTTTCTAGTTT 

[SEQ ID NO: 29] Exon 19

GTGAAAGAGCAACACTCTTGCCTTGAAAGAGAAAAAAAAATCCACTAATACAAGACTATCATAAATGATCTTTGT 
TTTATGTTGGAATAATCAATCTATAGCGGTCTATGTTACy^AAATTTAAAACATGTCTCTCAGTCCTTACA^^ 

TTTTATAACCTTTTTTCAG-ATTTTOCC OAAO-GTAAGGCATGCTACACACTCAAGCTCG6AATGTG 

AAGCAGGCATTTTCTCATCAGTGTGAAATGCAGAGAACTGGCTTGGGGGTATTATTTGAGAATAACCAATAAAAT 

AAAGGGAGTTCTGGAGGACCACCTGATGAAACATAGAGGTTTCTTTGCT 

[SEQ ID NO: 30] Exon 20 

GTCTTCTTAATTGTTTATGCTTGTACCCTTTGTAATCAGTTTTTTTAATAGTTAAAAGTAAATCTTC^ 

TAAGTAGAGGAAAGGATTAGATGAGTGTATCACACTATATATTATCATATAATGCACACTAACTACATTTATTTT 

CATCCTGTGACCCAAG-A GAAGATTA. . , . GACAGAAAT-GCAAGTATTTGTCACCTCTTTATGTGTGGCC 

ATTTCAAATTAATGATTAAGCAGAACATTAAATGCATAGTTTCTCACTGTTCACCT^ 

CGCATTAGAGGAACACTGAAGAGGGAGTCAGAAAAAT 

[SEQ ID NO: 31] Exon 21

TTTAATATTGTAAAGCATTTTTACACTTTAGTTAGAAAAAAAGATGAATATACTAGTAGGAAAATAGGGAAGGAC 
ATGAGCTGACAGCTAGAGCTTGATAATTTTATGATGia^GTTCACCTTTAAATATTAATAAAGCAATTTTCT 

TGTGCCTGATATCTGAGAGTTCTTCTCATTTTCGTTCTTCAG-GACA CCACCAC-GTAAGTTTTTTCC 

TCTCCTGACCTTCCCTTTTCTCCTTTTTGTTTTCTTTCTTGTTTATAAATCCTACCATACATTATAGGGTAATAT 

ATATATTACCTATTATATATATATAGCTATATATATATACCTTTGTTTATTTATTGTGA 

[SEQ ID NO: 32] Exon 22

CTCATCTTGAAAAGACTTCTTAAATATTTTATTTTTGTAAAGGACTTGACCAAACACATAACATTO 
CCTGTACTTGGGAAAGTTTTACAGGTTTAAGATG<?rACTCAGCTAATTTTTAAAAATGCTCCCCTAACCATGAGA 

AAGT ATAATTTCCTATGTTATTTGTGAAGAATGAAAAAGTTGTCCTCTTTTCTCTTTGTAG- AACTA TT 

CAAG-GTAAATAATGTTAACTCTATATTTGATAATTTTAATGAATTTGTGCACAT 

ATAGGCATAATTCATATGTATAGGACTTATGGTCTAAATTAAATGAATTAATACCAAATACATTCTTAAAGGTTT 

AACTTTGAGAATACTAGTACACAAAAATTCTAC 

[SEQ ID NO: 33] Exon 23

CTGGGTGATATAGCACGACTCTGTCTCTAAACAAAAAACAAAACAAAACGAAGACTGAAGCCAAACTTGA^ 
TCTTTATTTACTATAAATGCTAATTTTGAATCATGGTGTTAATTTATTTCACACGTCAACATGGTCCCTTGTTCT 

TTTGAAACTACACTGGCTTCTATCTTGTTTCAG-TTATA GAOOCA-GTAAGAACATATTTCATTACTC 

TTAAAAATAGGAATTACCATCCAGTAGAT^TGGGATTACCATCCAGTTGAGTCAAGAGAACCTTTTTTA 
GTCGTATGTTTATGTGTATGACACTTCTGACTACACAGGAAGCCTCTTGAAATATCTCATTAATTTTGATGTtTT 
GCTCAATGTTCAGTAAAA 
[SEQ ID NO: 34] Exon 24 

GTTCTTATATTTAATTATTGGTTGGAATTTGATTTTTATATGTATTAAAAGCATGCTCTACTGAAATATTCATCA 

AAAGGAAGATAGTTATTTCTTTCTTAAAATGAATATTGGC ATGTTTTACAG-AAAAA TGTGTC- 

GTAAGTAGCTTTTGTATATTTAC 

TTTGCATGTTGAAAATCTAGACATATGCATATTTGTTTATGTCACCCATCTGACATTACAGTGAGAGAAAGCAC^ 
ACTGAGTACACATGGACTTCGAAATTATAGGATGCTTTTAAATTTGATCTTTTAAGATGACATATCTTTGGGGAA 
GACTACCCTGTCTGCTTT 
[SEQ ID NO: 35] Exon 25

AATTAAACAAACATGCATGGTATGTATTAGAAGGAAAGCTACTCAAGAGGAGAGATGATGCCTAACAAATCATGT 




GGCACGTTCCACTTCAGAGCTGAAATCTCGTAAATGATTAAACTGGGGAGATGGAGCACTTATAGAAGTGAACTG 
AGTGTTCTCTTGGTAACTTTTCTTTTATATTTCCTATTCTCCTAG-CATGG . . . . ATTAA-AAAAGAAAAA 
TATTCCTATCCTGCTCACTGGTAATTAACATAGGTTTAAAATGGCTTCAAATGTGGCCCTATAGACGGTTAAAAT 
TGTACCTTATCTTGGCAAAACTTCAGAGCACCAGTCAGTGCATGCAAGGTGCCATTTTTTATTGAGATGCTTAGA 
ATGTTTCTTTCTGTGCAC 



It is to be understood that this invention is not limited to 
the particular methodology, protocols, formulations and 
reagents described, as such may, of course, vary. It is also to 
be understood that the terminology used herein is for the 
purpose of describing particular embodiments only, and is 
not intended to limit the scope of the present invention 
which will be limited only by the appended claims. 

It must be noted that as used herein and in the appended
claims, the singular forms "a", "an", and "the" include
plural referents unless the context clearly dictates otherwise.
Thus, for example, reference to "a complex" includes a
plurality of such complexes and reference to "the formulation"
includes reference to one or more formulations and
equivalents thereof known to those skilled in the art, and so
forth.

Unless defined otherwise, all technical and scientific
terms used herein have the same meaning as commonly
understood by one of ordinary skill in the art to which this
invention belongs. Although any methods, devices and
materials similar or equivalent to those described herein can
be used in the practice or testing of the invention, the
preferred methods, devices and materials are now described.

All publications mentioned herein are incorporated herein 
by reference for the purpose of describing and disclosing, for 
example, the cell lines, constructs, and methodologies that 
are described in the publications which might be used in 
connection with the presently described invention. The
publications discussed above and throughout the text are 
provided solely for their disclosure prior to the filing date of 
the present application. Nothing herein is to be construed as 
an admission that the inventors are not entitled to antedate 
such disclosure by virtue of prior invention. 



SEQUENCE LISTING 



<160> NUMBER OF SEQ ID NOS: 35 

<210> SEQ ID NO 1 

<211> LENGTH: 3486 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: CDS

<222> LOCATION: (164)...(2785)

<400> SEQUENCE: 1 

ggccacgatg gagcgcgacg gctgcgcggg gggcgggagc cgcggcggcg agggcgggcg 60

cgctccccgg gagggcccgg cggggaacgg ccgcgatcgg ggccgcagcc acgctgccga 120

ggcgcccggg gacccgcagg cggccgcgtc cttgctggcc cct atg gac gtg ggg 175
Met Asp Val Gly
1

gag gag ccg ctg gag aag gcg gcg cgc gcc cgc act gcc aag gac ccc 223
Glu Glu Pro Leu Glu Lys Ala Ala Arg Ala Arg Thr Ala Lys Asp Pro
5 10 15 20

aac acc tat aaa gta ctc tcg ctg gta ttg tca gta tgt gtg tta aca 271
Asn Thr Tyr Lys Val Leu Ser Leu Val Leu Ser Val Cys Val Leu Thr
25 30 35

aca ata ctt ggt tgt ata ttt ggg ttg aaa cca agc tgt gcc aaa gaa 319
Thr Ile Leu Gly Cys Ile Phe Gly Leu Lys Pro Ser Cys Ala Lys Glu
40 45 50

gtt aaa agt tgc aaa ggt cgc tgt ttc gag aga aca ttt ggg aac tgt 367
Val Lys Ser Cys Lys Gly Arg Cys Phe Glu Arg Thr Phe Gly Asn Cys
55 60 65

cgc tgt gat gct gcc tgt gtt gag ctt gga aac tgc tgt tta gat tac 415
Arg Cys Asp Ala Ala Cys Val Glu Leu Gly Asn Cys Cys Leu Asp Tyr
70 75 80
cag gag acg tgc ata gaa cca gaa cat ata tgg act tgc aac aaa ttc 463
Gln Glu Thr Cys Ile Glu Pro Glu His Ile Trp Thr Cys Asn Lys Phe
85 90 95 100

agg tgt ggt gag aaa agg ttg acc aga agc ctc tgt gcc tgt tca gat 511
Arg Cys Gly Glu Lys Arg Leu Thr Arg Ser Leu Cys Ala Cys Ser Asp
105 110 115

gac tgc aag gac aag ggc gac tgc tgc atc aac tac agt tct gtg tgt 559
Asp Cys Lys Asp Lys Gly Asp Cys Cys Ile Asn Tyr Ser Ser Val Cys
120 125 130

caa ggt gag aaa agt tgg gta gaa gaa cca tgt gag agc att aat gag 607
Gln Gly Glu Lys Ser Trp Val Glu Glu Pro Cys Glu Ser Ile Asn Glu
135 140 145

cca cag tgc cca gca ggg ttt gaa acg cct cct acc ctc tta ttt tct 655
Pro Gln Cys Pro Ala Gly Phe Glu Thr Pro Pro Thr Leu Leu Phe Ser
150 155 160

ttg gat gga ttc agg gca gaa tat tta cac act tgg ggt gga ctt ctt 703
Leu Asp Gly Phe Arg Ala Glu Tyr Leu His Thr Trp Gly Gly Leu Leu
165 170 175 180

cct gtt att agc aaa cta aaa aaa tgt gga aca tat act aaa aac atg 751
Pro Val Ile Ser Lys Leu Lys Lys Cys Gly Thr Tyr Thr Lys Asn Met
185 190 195

aga ccg gta tat cca aca aaa act ttc ccc aat cac tac agc att gtc 799
Arg Pro Val Tyr Pro Thr Lys Thr Phe Pro Asn His Tyr Ser Ile Val
200 205 210

acc gga ttg tat cca gaa tct cat ggc ata atc gac aat aaa atg tat 847
Thr Gly Leu Tyr Pro Glu Ser His Gly Ile Ile Asp Asn Lys Met Tyr
215 220 225

gat ccc aaa atg aat gct tcc ttt tca ctt aaa agt aaa gag aaa ttt 895
Asp Pro Lys Met Asn Ala Ser Phe Ser Leu Lys Ser Lys Glu Lys Phe
230 235 240

aat cct gag tgg tac aaa gga gaa cca att tgg gtc aca gct aag tat 943
Asn Pro Glu Trp Tyr Lys Gly Glu Pro Ile Trp Val Thr Ala Lys Tyr
245 250 255 260

caa ggc ctc aag tct ggc aca ttt ttc tgg cca gga tca gat gtg gaa 991
Gln Gly Leu Lys Ser Gly Thr Phe Phe Trp Pro Gly Ser Asp Val Glu
265 270 275

att aac gga att ttc cca gac atc tat aaa atg tat aat ggt tca gta 1039
Ile Asn Gly Ile Phe Pro Asp Ile Tyr Lys Met Tyr Asn Gly Ser Val
280 285 290

cca ttt gaa gaa agg att tta gct gtt ctt cag tgg cta cag ctt cct 1087
Pro Phe Glu Glu Arg Ile Leu Ala Val Leu Gln Trp Leu Gln Leu Pro
295 300 305

aaa gat gaa aga cca cac ttt tac act ctg tat tta gaa gaa cca gat 1135
Lys Asp Glu Arg Pro His Phe Tyr Thr Leu Tyr Leu Glu Glu Pro Asp
310 315 320

tct tca ggt cat tca tat gga cca gtc agc agt gaa gtc atc aaa gcc 1183
Ser Ser Gly His Ser Tyr Gly Pro Val Ser Ser Glu Val Ile Lys Ala
325 330 335 340

ttg cag agg gtt gat ggt atg gtt ggt atg ctg atg gat ggt ctg aaa 1231
Leu Gln Arg Val Asp Gly Met Val Gly Met Leu Met Asp Gly Leu Lys
345 350 355

gag ctg aac ttg cac aga tgc ctg aac ctc atc ctt att tca gat cat 1279
Glu Leu Asn Leu His Arg Cys Leu Asn Leu Ile Leu Ile Ser Asp His
360 365 370

ggc atg gaa caa ggc agt tgt aag aaa tac ata tat ctg aat aaa tat 1327
Gly Met Glu Gln Gly Ser Cys Lys Lys Tyr Ile Tyr Leu Asn Lys Tyr
375 380 385

ttg ggg gat gtt aaa aat att aaa gtt atc tat gga cct gca gct cga 1375
Leu Gly Asp Val Lys Asn Ile Lys Val Ile Tyr Gly Pro Ala Ala Arg
390 395 400
ttg aga ccc tct gat gtc cca gat aaa tac tat tca ttt aac tat gaa 1423
Leu Arg Pro Ser Asp Val Pro Asp Lys Tyr Tyr Ser Phe Asn Tyr Glu
405 410 415 420

ggc att gcc cga aat ctt tct tgc cgg gaa cca aac cag cac ttc aaa 1471
Gly Ile Ala Arg Asn Leu Ser Cys Arg Glu Pro Asn Gln His Phe Lys
425 430 435

cct tac ctg aaa cat ttc tta cct aag cgt ttg cac ttt gct aag agt 1519
Pro Tyr Leu Lys His Phe Leu Pro Lys Arg Leu His Phe Ala Lys Ser
440 445 450

gat aga att gag ccc ttg aca ttc tat ttg gac cct cag tgg caa ctt 1567
Asp Arg Ile Glu Pro Leu Thr Phe Tyr Leu Asp Pro Gln Trp Gln Leu
455 460 465

gca ttg aat ccc tca gaa agg aaa tat tgt gga agt gga ttt cat ggc 1615
Ala Leu Asn Pro Ser Glu Arg Lys Tyr Cys Gly Ser Gly Phe His Gly
470 475 480

tct gac aat gta ttt tca aat atg caa gcc ctc ttt gtt ggc tat gga 1663
Ser Asp Asn Val Phe Ser Asn Met Gln Ala Leu Phe Val Gly Tyr Gly
485 490 495 500

cct gga ttc aag cat ggc att gag gct gac acc ttt gaa aac att gaa 1711
Pro Gly Phe Lys His Gly Ile Glu Ala Asp Thr Phe Glu Asn Ile Glu
505 510 515

gtc tat aac tta atg tgt gat tta ctg aat ttg aca ccg gct cct aat 1759
Val Tyr Asn Leu Met Cys Asp Leu Leu Asn Leu Thr Pro Ala Pro Asn
520 525 530

aac gga act cat gga agt ctt aac cac ctt cta aag aat cct gtt tat 1807
Asn Gly Thr His Gly Ser Leu Asn His Leu Leu Lys Asn Pro Val Tyr
535 540 545

acg cca aag cat ccc aaa gaa gtg cac ccc ctg gta cag tgc ccc ttc 1855
Thr Pro Lys His Pro Lys Glu Val His Pro Leu Val Gln Cys Pro Phe
550 555 560

aca aga aac ccc aga gat aac ctt ggc tgc tca tgt aac cct tcg att 1903
Thr Arg Asn Pro Arg Asp Asn Leu Gly Cys Ser Cys Asn Pro Ser Ile
565 570 575 580

ttg ccg att gag gat ttt caa aca cag ttc aat ctg act gtg gca gaa 1951
Leu Pro Ile Glu Asp Phe Gln Thr Gln Phe Asn Leu Thr Val Ala Glu
585 590 595

gag aag att att aag cat gaa act tta ccc tat gga aga cct aga gtt 1999
Glu Lys Ile Ile Lys His Glu Thr Leu Pro Tyr Gly Arg Pro Arg Val
600 605 610

ctc cag aag gaa aac acc atc tgt ctt ctt tcc cag cac cag ttt atg 2047
Leu Gln Lys Glu Asn Thr Ile Cys Leu Leu Ser Gln His Gln Phe Met
615 620 625

agt gga tac agc caa gac atc tta atg ccc ctt tgg aca tcc tat acc 2095
Ser Gly Tyr Ser Gln Asp Ile Leu Met Pro Leu Trp Thr Ser Tyr Thr
630 635 640

gtg gac aga aat gac agt ttc tct acg gaa gac ttc tcc aac tgt ctg 2143
Val Asp Arg Asn Asp Ser Phe Ser Thr Glu Asp Phe Ser Asn Cys Leu
645 650 655 660

tac cag gac ttt aga att cct ctt agt cct gtc cat aaa tgt tca ttt 2191
Tyr Gln Asp Phe Arg Ile Pro Leu Ser Pro Val His Lys Cys Ser Phe
665 670 675

tat aaa aat aac acc aaa gtg agt tac ggg ttc ctc tcc cca cca caa 2239
Tyr Lys Asn Asn Thr Lys Val Ser Tyr Gly Phe Leu Ser Pro Pro Gln
680 685 690

cta aat aaa aat tca agt gga ata tat tct gaa gct ttg ctt act aca 2287
Leu Asn Lys Asn Ser Ser Gly Ile Tyr Ser Glu Ala Leu Leu Thr Thr
695 700 705

aat ata gtg cca atg tac cag agt ttt caa gtt ata tgg cgc tac ttt 2335
Asn Ile Val Pro Met Tyr Gln Ser Phe Gln Val Ile Trp Arg Tyr Phe
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710 



715 



720 



cat gac acc eta ctg cga aag tat get gaa gaa aga aat ggt gtc aat 
His Asp Thr Leu Leu Arg Lys Tyr Ala Glu Glu Arg Asn Gly Val Asn 
725 730 735 740 



2383 



gtc gtc agt ggt cot gtg ttt gac ttt gat tat gat ggo cgt tgt gat 
Val Val Ser Gly Pro Val Phe Asp Phe Asp Tyr Asp Gly Arg Cys Asp 

745 750 755 



tec tta gag aat ctg agg caa aaa aga aga gtc ate cgt aac caa gaa 
Ser Leu Glu Asn Leu Arg Gin Lys Arg Arg Val lie Arg Asn Gin Glu 
760 765 770 



2479 



att ttg att cea act eac ttc ttt att gtg eta aea age tgt aaa gat 
lie Leu lie Pro Thr His Phe Phe lie Val Leu Thr Ser Cys Lys Asp 
775 780 785 



aca tct cag acg cct ttg cac tgt gaa aac eta gac acc tta get ttc 
Thr Ser Gin Thr Pro Leu His Cys Glu Asn Leu Asp Thr Leu Ala Phe 
790 795 800 



2575 



att ttg cct cac agg act gat aac agc gag agc tgt gtg cat ggg aag
Ile Leu Pro His Arg Thr Asp Asn Ser Glu Ser Cys Val His Gly Lys
805 810 815 820

cat gac tcc tca tgg gtt gaa gaa ttg tta atg tta cac aga gca cgg
His Asp Ser Ser Trp Val Glu Glu Leu Leu Met Leu His Arg Ala Arg
825 830 835

atc aca gat gtt gag cac atc act gga ctc agc ttc tat caa caa aga
Ile Thr Asp Val Glu His Ile Thr Gly Leu Ser Phe Tyr Gln Gln Arg
840 845 850

aaa gag cca gtt tca gac att tta aag ttg aaa aca cat ttg cca acc
Lys Glu Pro Val Ser Asp Ile Leu Lys Leu Lys Thr His Leu Pro Thr
855 860 865



ttt agc caa gaa gac tga tatgtttttt atccccaaac accatgaatc 2815
Phe Ser Gln Glu Asp
870



tttttgagag aaccttatat tttatatagt cctctagcta cactattgca ttgttcagaa 2875
actgtcgacc agagttagaa cggagccctc ggtgatgcgg acatctcagg gaaacttgcg 2935
tactcagcac agcagtggag agtgttcctg ttgaatcttg cacatatttg aatgtgtaag 2995
cattgtatac attgatcaag ttcgggggaa taaagacaga ccacacctaa aactgccttt 3055
ctgcttctct taaaggagaa gtagctgtga acattgtctg gataccagat atttgaatct 3115
ttcttactat tggtaataaa ccttgatggc attgggcaaa cagtagactt atagtagggt 3175
tggggtagcc catgttatgt gactatcttt atgagaattt taaagtggtt ctggatatct 3235
tttaacttgg agtttcattt cttttcattg taatcaaaaa aaaaattaac agaagccaaa 3295
atacttctga gaccttgttt caatctttgc tgtatatccc ctcaaaatcc aagttattaa 3355
tcttatgtgt tttcttttta attttttgat tggatttctt tagatttaat ggttcaaatg 3415
agttcaactt tgagggacga tctttgaata tacttaccta ttataaaatc ttactttgta 3475
tttgtattta a 3486



<210> SEQ ID NO 2 

<211> LENGTH: 873 

<212> TYPE: PRT 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 2 



Met Asp Val Gly Glu Glu Pro Leu Glu Lys Ala Ala Arg Ala Arg Thr
1 5 10 15

Ala Lys Asp Pro Asn Thr Tyr Lys Val Leu Ser Leu Val Leu Ser Val
20 25 30

Cys Val Leu Thr Thr Ile Leu Gly Cys Ile Phe Gly Leu Lys Pro Ser
35 40 45

Cys Ala Lys Glu Val Lys Ser Cys Lys Gly Arg Cys Phe Glu Arg Thr
50 55 60

Phe Gly Asn Cys Arg Cys Asp Ala Ala Cys Val Glu Leu Gly Asn Cys
65 70 75 80

Cys Leu Asp Tyr Gln Glu Thr Cys Ile Glu Pro Glu His Ile Trp Thr
85 90 95

Cys Asn Lys Phe Arg Cys Gly Glu Lys Arg Leu Thr Arg Ser Leu Cys
100 105 110

Ala Cys Ser Asp Asp Cys Lys Asp Lys Gly Asp Cys Cys Ile Asn Tyr
115 120 125

Ser Ser Val Cys Gln Gly Glu Lys Ser Trp Val Glu Glu Pro Cys Glu
130 135 140

Ser Ile Asn Glu Pro Gln Cys Pro Ala Gly Phe Glu Thr Pro Pro Thr
145 150 155 160

Leu Leu Phe Ser Leu Asp Gly Phe Arg Ala Glu Tyr Leu His Thr Trp 
165 170 175 

Gly Gly Leu Leu Pro Val Ile Ser Lys Leu Lys Lys Cys Gly Thr Tyr
180 185 190

Thr Lys Asn Met Arg Pro Val Tyr Pro Thr Lys Thr Phe Pro Asn His
195 200 205

Tyr Ser Ile Val Thr Gly Leu Tyr Pro Glu Ser His Gly Ile Ile Asp
210 215 220

Asn Lys Met Tyr Asp Pro Lys Met Asn Ala Ser Phe Ser Leu Lys Ser
225 230 235 240

Lys Glu Lys Phe Asn Pro Glu Trp Tyr Lys Gly Glu Pro Ile Trp Val
245 250 255

Thr Ala Lys Tyr Gln Gly Leu Lys Ser Gly Thr Phe Phe Trp Pro Gly
260 265 270

Ser Asp Val Glu Ile Asn Gly Ile Phe Pro Asp Ile Tyr Lys Met Tyr
275 280 285

Asn Gly Ser Val Pro Phe Glu Glu Arg Ile Leu Ala Val Leu Gln Trp
290 295 300

Leu Gln Leu Pro Lys Asp Glu Arg Pro His Phe Tyr Thr Leu Tyr Leu
305 310 315 320

Glu Glu Pro Asp Ser Ser Gly His Ser Tyr Gly Pro Val Ser Ser Glu 
325 330 335 

Val Ile Lys Ala Leu Gln Arg Val Asp Gly Met Val Gly Met Leu Met
340 345 350

Asp Gly Leu Lys Glu Leu Asn Leu His Arg Cys Leu Asn Leu Ile Leu
355 360 365

Ile Ser Asp His Gly Met Glu Gln Gly Ser Cys Lys Lys Tyr Ile Tyr
370 375 380

Leu Asn Lys Tyr Leu Gly Asp Val Lys Asn Ile Lys Val Ile Tyr Gly
385 390 395 400

Pro Ala Ala Arg Leu Arg Pro Ser Asp Val Pro Asp Lys Tyr Tyr Ser
405 410 415

Phe Asn Tyr Glu Gly Ile Ala Arg Asn Leu Ser Cys Arg Glu Pro Asn
420 425 430

Gln His Phe Lys Pro Tyr Leu Lys His Phe Leu Pro Lys Arg Leu His
435 440 445

Phe Ala Lys Ser Asp Arg Ile Glu Pro Leu Thr Phe Tyr Leu Asp Pro
450 455 460

Gln Trp Gln Leu Ala Leu Asn Pro Ser Glu Arg Lys Tyr Cys Gly Ser
465 470 475 480

Gly Phe His Gly Ser Asp Asn Val Phe Ser Asn Met Gln Ala Leu Phe
485 490 495

Val Gly Tyr Gly Pro Gly Phe Lys His Gly Ile Glu Ala Asp Thr Phe
500 505 510

Glu Asn Ile Glu Val Tyr Asn Leu Met Cys Asp Leu Leu Asn Leu Thr
515 520 525

Pro Ala Pro Asn Asn Gly Thr His Gly Ser Leu Asn His Leu Leu Lys
530 535 540

Asn Pro Val Tyr Thr Pro Lys His Pro Lys Glu Val His Pro Leu Val
545 550 555 560

Gln Cys Pro Phe Thr Arg Asn Pro Arg Asp Asn Leu Gly Cys Ser Cys
565 570 575

Asn Pro Ser Ile Leu Pro Ile Glu Asp Phe Gln Thr Gln Phe Asn Leu
580 585 590

Thr Val Ala Glu Glu Lys Ile Ile Lys His Glu Thr Leu Pro Tyr Gly
595 600 605

Arg Pro Arg Val Leu Gln Lys Glu Asn Thr Ile Cys Leu Leu Ser Gln
610 615 620

His Gln Phe Met Ser Gly Tyr Ser Gln Asp Ile Leu Met Pro Leu Trp
625 630 635 640

Thr Ser Tyr Thr Val Asp Arg Asn Asp Ser Phe Ser Thr Glu Asp Phe
645 650 655

Ser Asn Cys Leu Tyr Gln Asp Phe Arg Ile Pro Leu Ser Pro Val His
660 665 670

Lys Cys Ser Phe Tyr Lys Asn Asn Thr Lys Val Ser Tyr Gly Phe Leu 
675 680 685 

Ser Pro Pro Gln Leu Asn Lys Asn Ser Ser Gly Ile Tyr Ser Glu Ala
690 695 700

Leu Leu Thr Thr Asn Ile Val Pro Met Tyr Gln Ser Phe Gln Val Ile
705 710 715 720

Trp Arg Tyr Phe His Asp Thr Leu Leu Arg Lys Tyr Ala Glu Glu Arg
725 730 735

Asn Gly Val Asn Val Val Ser Gly Pro Val Phe Asp Phe Asp Tyr Asp
740 745 750

Gly Arg Cys Asp Ser Leu Glu Asn Leu Arg Gln Lys Arg Arg Val Ile
755 760 765

Arg Asn Gln Glu Ile Leu Ile Pro Thr His Phe Phe Ile Val Leu Thr
770 775 780

Ser Cys Lys Asp Thr Ser Gln Thr Pro Leu His Cys Glu Asn Leu Asp
785 790 795 800

Thr Leu Ala Phe Ile Leu Pro His Arg Thr Asp Asn Ser Glu Ser Cys
805 810 815

Val His Gly Lys His Asp Ser Ser Trp Val Glu Glu Leu Leu Met Leu
820 825 830

His Arg Ala Arg Ile Thr Asp Val Glu His Ile Thr Gly Leu Ser Phe
835 840 845

Tyr Gln Gln Arg Lys Glu Pro Val Ser Asp Ile Leu Lys Leu Lys Thr
850 855 860

His Leu Pro Thr Phe Ser Gln Glu Asp
865 870



<210> SEQ ID NO 3 

<211> LENGTH: 3486 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: CDS

<222> LOCATION: (164)...(2785)

<400> SEQUENCE: 3 

ggccacgatg gagcgcgacg gctgcgcggg gggcgggagc cgcggcggcg agggcgggcg 60 

cgctccccgg gagggcccgg cggggaacgg ccgcgatcgg ggccgcagcc acgctgccga 120 

ggcgcccggg gacccgcagg cggccgcgtc cttgctggcc cct atg gac gtg ggg 175 

Met Asp Val Gly 
1 

gag gag ccg ctg gag aag gcg gcg cgc gcc cgc act gcc aag gac ccc 223 
Glu Glu Pro Leu Glu Lys Ala Ala Arg Ala Arg Thr Ala Lys Asp Pro 
5 10 15 20 

aac acc tat aaa gta ctc tcg ctg gta ttg tca gta tgt gtg tta aca 271
Asn Thr Tyr Lys Val Leu Ser Leu Val Leu Ser Val Cys Val Leu Thr
25 30 35

aca ata ctt ggt tgt ata ttt ggg ttg aaa cca agc tgt gcc aaa gaa 319
Thr Ile Leu Gly Cys Ile Phe Gly Leu Lys Pro Ser Cys Ala Lys Glu
40 45 50

gtt aaa agt tgc aaa ggt cgc tgt ttc gag aga aca ttt ggg aac tgt 367
Val Lys Ser Cys Lys Gly Arg Cys Phe Glu Arg Thr Phe Gly Asn Cys
55 60 65

cgc tgt gat gct gcc tgt gtt gag ctt gga aac tgc tgt tta gat tac 415
Arg Cys Asp Ala Ala Cys Val Glu Leu Gly Asn Cys Cys Leu Asp Tyr
70 75 80

cag gag acg tgc ata gaa cca gaa cat ata tgg act tgc aac aaa ttc 463
Gln Glu Thr Cys Ile Glu Pro Glu His Ile Trp Thr Cys Asn Lys Phe
85 90 95 100

agg tgt ggt gag aaa agg ttg acc aga agc ctc tgt gcc tgt tca gat 511
Arg Cys Gly Glu Lys Arg Leu Thr Arg Ser Leu Cys Ala Cys Ser Asp
105 110 115

gac tgc aag gac cag ggc gac tgc tgc atc aac tac agt tct gtg tgt 559
Asp Cys Lys Asp Gln Gly Asp Cys Cys Ile Asn Tyr Ser Ser Val Cys
120 125 130

caa ggt gag aaa agt tgg gta gaa gaa cca tgt gag agc att aat gag 607
Gln Gly Glu Lys Ser Trp Val Glu Glu Pro Cys Glu Ser Ile Asn Glu
135 140 145

cca cag tgc cca gca ggg ttt gaa acg cct cct acc ctc tta ttt tct 655
Pro Gln Cys Pro Ala Gly Phe Glu Thr Pro Pro Thr Leu Leu Phe Ser
150 155 160

ttg gat gga ttc agg gca gaa tat tta cac act tgg ggt gga ctt ctt 703
Leu Asp Gly Phe Arg Ala Glu Tyr Leu His Thr Trp Gly Gly Leu Leu
165 170 175 180

cct gtt att agc aaa cta aaa aaa tgt gga aca tat act aaa aac atg 751
Pro Val Ile Ser Lys Leu Lys Lys Cys Gly Thr Tyr Thr Lys Asn Met
185 190 195

aga ccg gta tat cca aca aaa act ttc ccc aat cac tac agc att gtc 799
Arg Pro Val Tyr Pro Thr Lys Thr Phe Pro Asn His Tyr Ser Ile Val
200 205 210

acc gga ttg tat cca gaa tct cat ggc ata atc gac aat aaa atg tat 847
Thr Gly Leu Tyr Pro Glu Ser His Gly Ile Ile Asp Asn Lys Met Tyr
215 220 225

gat ccc aaa atg aat gct tcc ttt tca ctt aaa agt aaa gag aaa ttt 895
Asp Pro Lys Met Asn Ala Ser Phe Ser Leu Lys Ser Lys Glu Lys Phe
230 235 240

aat cct gag tgg tac aaa gga gaa cca att tgg gtc aca gct aag tat 943
Asn Pro Glu Trp Tyr Lys Gly Glu Pro Ile Trp Val Thr Ala Lys Tyr
245 250 255 260

caa ggc ctc aag tct ggc aca ttt ttc tgg cca gga tca gat gtg gaa 991
Gln Gly Leu Lys Ser Gly Thr Phe Phe Trp Pro Gly Ser Asp Val Glu
265 270 275

att aac gga att ttc cca gac atc tat aaa atg tat aat ggt tca gta 1039
Ile Asn Gly Ile Phe Pro Asp Ile Tyr Lys Met Tyr Asn Gly Ser Val
280 285 290

cca ttt gaa gaa agg att tta gct gtt ctt cag tgg cta cag ctt cct 1087
Pro Phe Glu Glu Arg Ile Leu Ala Val Leu Gln Trp Leu Gln Leu Pro
295 300 305

aaa gat gaa aga cca cac ttt tac act ctg tat tta gaa gaa cca gat 1135
Lys Asp Glu Arg Pro His Phe Tyr Thr Leu Tyr Leu Glu Glu Pro Asp
310 315 320

tct tca ggt cat tca tat gga cca gtc agc agt gaa gtc atc aaa gcc 1183
Ser Ser Gly His Ser Tyr Gly Pro Val Ser Ser Glu Val Ile Lys Ala
325 330 335 340

ttg cag agg gtt gat ggt atg gtt ggt atg ctg atg gat ggt ctg aaa 1231
Leu Gln Arg Val Asp Gly Met Val Gly Met Leu Met Asp Gly Leu Lys
345 350 355

gag ctg aac ttg cac aga tgc ctg aac ctc atc ctt att tca gat cat 1279
Glu Leu Asn Leu His Arg Cys Leu Asn Leu Ile Leu Ile Ser Asp His
360 365 370

ggc atg gaa caa ggc agt tgt aag aaa tac ata tat ctg aat aaa tat 1327
Gly Met Glu Gln Gly Ser Cys Lys Lys Tyr Ile Tyr Leu Asn Lys Tyr
375 380 385

ttg ggg gat gtt aaa aat att aaa gtt atc tat gga cct gca gct cga 1375
Leu Gly Asp Val Lys Asn Ile Lys Val Ile Tyr Gly Pro Ala Ala Arg
390 395 400

ttg aga ccc tct gat gtc cca gat aaa tac tat tca ttt aac tat gaa 1423
Leu Arg Pro Ser Asp Val Pro Asp Lys Tyr Tyr Ser Phe Asn Tyr Glu
405 410 415 420

ggc att gcc cga aat ctt tct tgc cgg gaa cca aac cag cac ttc aaa 1471
Gly Ile Ala Arg Asn Leu Ser Cys Arg Glu Pro Asn Gln His Phe Lys
425 430 435

cct tac ctg aaa cat ttc tta cct aag cgt ttg cac ttt gct aag agt 1519
Pro Tyr Leu Lys His Phe Leu Pro Lys Arg Leu His Phe Ala Lys Ser
440 445 450

gat aga att gag ccc ttg aca ttc tat ttg gac cct cag tgg caa ctt 1567
Asp Arg Ile Glu Pro Leu Thr Phe Tyr Leu Asp Pro Gln Trp Gln Leu
455 460 465

gca ttg aat ccc tca gaa agg aaa tat tgt gga agt gga ttt cat ggc 1615
Ala Leu Asn Pro Ser Glu Arg Lys Tyr Cys Gly Ser Gly Phe His Gly
470 475 480

tct gac aat gta ttt tca aat atg caa gcc ctc ttt gtt ggc tat gga 1663
Ser Asp Asn Val Phe Ser Asn Met Gln Ala Leu Phe Val Gly Tyr Gly
485 490 495 500

cct gga ttc aag cat ggc att gag gct gac acc ttt gaa aac att gaa 1711
Pro Gly Phe Lys His Gly Ile Glu Ala Asp Thr Phe Glu Asn Ile Glu
505 510 515

gtc tat aac tta atg tgt gat tta ctg aat ttg aca ccg gct cct aat 1759
Val Tyr Asn Leu Met Cys Asp Leu Leu Asn Leu Thr Pro Ala Pro Asn
520 525 530

aac gga act cat gga agt ctt aac cac ctt cta aag aat cct gtt tat 1807
Asn Gly Thr His Gly Ser Leu Asn His Leu Leu Lys Asn Pro Val Tyr
535 540 545

acg cca aag cat ccc aaa gaa gtg cac ccc ctg gta cag tgc ccc ttc 1855
Thr Pro Lys His Pro Lys Glu Val His Pro Leu Val Gln Cys Pro Phe
550 555 560

aca aga aac ccc aga gat aac ctt ggc tgc tca tgt aac cct tcg att 1903
Thr Arg Asn Pro Arg Asp Asn Leu Gly Cys Ser Cys Asn Pro Ser Ile
565 570 575 580

ttg ccg att gag gat ttt caa aca cag ttc aat ctg act gtg gca gaa 1951
Leu Pro Ile Glu Asp Phe Gln Thr Gln Phe Asn Leu Thr Val Ala Glu
585 590 595

gag aag att att aag cat gaa act tta ccc tat gga aga cct aga gtt 1999
Glu Lys Ile Ile Lys His Glu Thr Leu Pro Tyr Gly Arg Pro Arg Val
600 605 610

ctc cag aag gaa aac acc atc tgt ctt ctt tcc cag cac cag ttt atg 2047
Leu Gln Lys Glu Asn Thr Ile Cys Leu Leu Ser Gln His Gln Phe Met
615 620 625

agt gga tac agc caa gac atc tta atg ccc ctt tgg aca tcc tat acc 2095
Ser Gly Tyr Ser Gln Asp Ile Leu Met Pro Leu Trp Thr Ser Tyr Thr
630 635 640

gtg gac aga aat gac agt ttc tct acg gaa gac ttc tcc aac tgt ctg 2143
Val Asp Arg Asn Asp Ser Phe Ser Thr Glu Asp Phe Ser Asn Cys Leu
645 650 655 660

tac cag gac ttt aga att cct ctt agt cct gtc cat aaa tgt tca ttt 2191
Tyr Gln Asp Phe Arg Ile Pro Leu Ser Pro Val His Lys Cys Ser Phe
665 670 675

tat aaa aat aac acc aaa gtg agt tac ggg ttc ctc tcc cca cca caa 2239
Tyr Lys Asn Asn Thr Lys Val Ser Tyr Gly Phe Leu Ser Pro Pro Gln
680 685 690

cta aat aaa aat tca agt gga ata tat tct gaa gct ttg ctt act aca 2287
Leu Asn Lys Asn Ser Ser Gly Ile Tyr Ser Glu Ala Leu Leu Thr Thr
695 700 705

aat ata gtg cca atg tac cag agt ttt caa gtt ata tgg cgc tac ttt 2335
Asn Ile Val Pro Met Tyr Gln Ser Phe Gln Val Ile Trp Arg Tyr Phe
710 715 720

cat gac acc cta ctg cga aag tat gct gaa gaa aga aat ggt gtc aat 2383
His Asp Thr Leu Leu Arg Lys Tyr Ala Glu Glu Arg Asn Gly Val Asn
725 730 735 740

gtc gtc agt ggt cct gtg ttt gac ttt gat tat gat gga cgt tgt gat 2431
Val Val Ser Gly Pro Val Phe Asp Phe Asp Tyr Asp Gly Arg Cys Asp
745 750 755

tcc tta gag aat ctg agg caa aaa aga aga gtc atc cgt aac caa gaa 2479
Ser Leu Glu Asn Leu Arg Gln Lys Arg Arg Val Ile Arg Asn Gln Glu
760 765 770

att ttg att cca act cac ttc ttt att gtg cta aca agc tgt aaa gat 2527
Ile Leu Ile Pro Thr His Phe Phe Ile Val Leu Thr Ser Cys Lys Asp
775 780 785

aca tct cag acg cct ttg cac tgt gaa aac cta gac acc tta gct ttc 2575
Thr Ser Gln Thr Pro Leu His Cys Glu Asn Leu Asp Thr Leu Ala Phe
790 795 800

att ttg cct cac agg act gat aac agc gag agc tgt gtg cat ggg aag 2623
Ile Leu Pro His Arg Thr Asp Asn Ser Glu Ser Cys Val His Gly Lys
805 810 815 820

cat gac tcc tca tgg gtt gaa gaa ttg tta atg tta cac aga gca cgg 2671
His Asp Ser Ser Trp Val Glu Glu Leu Leu Met Leu His Arg Ala Arg
825 830 835

atc aca gat gtt gag cac atc act gga ctc agc ttc tat caa caa aga 2719
Ile Thr Asp Val Glu His Ile Thr Gly Leu Ser Phe Tyr Gln Gln Arg
840 845 850

aaa gag cca gtt tca gac att tta aag ttg aaa aca cat ttg cca acc 2767
Lys Glu Pro Val Ser Asp Ile Leu Lys Leu Lys Thr His Leu Pro Thr
855 860 865

ttt agc caa gaa gac tga tatgtttttt atccccaaac accatgaatc 2815
Phe Ser Gln Glu Asp
870



tttttgagag aaccttatat tttatatagt cctctagcta cactattgca ttgttcagaa 2875
actgtcgacc agagttagaa cggagccctc ggtgatgcgg acatctcagg gaaacttgcg 2935
tactcagcac agcagtggag agtgttcctg ttgaatcttg cacatatttg aatgtgtaag 2995
cattgtatac attgatcaag ttcgggggaa taaagacaga ccacacctaa aactgccttt 3055
ctgcttctct taaaggagaa gtagctgtga acattgtctg gataccagat atttgaatct 3115
ttcttactat tggtaataaa ccttgatggc attgggcaaa cagtagactt atagtagggt 3175
tggggtagcc catgttatgt gactatcttt atgagaattt taaagtggtt ctggatatct 3235
tttaacttgg agtttcattt cttttcattg taatcaaaaa aaaaattaac agaagccaaa 3295
atacttctga gaccttgttt caatctttgc tgtatatccc ctcaaaatcc aagttattaa 3355
tcttatgtgt tttcttttta attttttgat tggatttctt tagatttaat ggttcaaatg 3415
agttcaactt tgagggacga tctttgaata tacttaccta ttataaaatc ttactttgta 3475
tttgtattta a 3486



<210> SEQ ID NO 4 

<211> LENGTH: 873 

<212> TYPE: PRT 

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 4

Met Asp Val Gly Glu Glu Pro Leu Glu Lys Ala Ala Arg Ala Arg Thr
1 5 10 15

Ala Lys Asp Pro Asn Thr Tyr Lys Val Leu Ser Leu Val Leu Ser Val
20 25 30

Cys Val Leu Thr Thr Ile Leu Gly Cys Ile Phe Gly Leu Lys Pro Ser
35 40 45

Cys Ala Lys Glu Val Lys Ser Cys Lys Gly Arg Cys Phe Glu Arg Thr
50 55 60

Phe Gly Asn Cys Arg Cys Asp Ala Ala Cys Val Glu Leu Gly Asn Cys
65 70 75 80

Cys Leu Asp Tyr Gln Glu Thr Cys Ile Glu Pro Glu His Ile Trp Thr
85 90 95

Cys Asn Lys Phe Arg Cys Gly Glu Lys Arg Leu Thr Arg Ser Leu Cys
100 105 110

Ala Cys Ser Asp Asp Cys Lys Asp Gln Gly Asp Cys Cys Ile Asn Tyr
115 120 125

Ser Ser Val Cys Gln Gly Glu Lys Ser Trp Val Glu Glu Pro Cys Glu
130 135 140

Ser Ile Asn Glu Pro Gln Cys Pro Ala Gly Phe Glu Thr Pro Pro Thr
145 150 155 160

Leu Leu Phe Ser Leu Asp Gly Phe Arg Ala Glu Tyr Leu His Thr Trp
165 170 175

Gly Gly Leu Leu Pro Val Ile Ser Lys Leu Lys Lys Cys Gly Thr Tyr
180 185 190

Thr Lys Asn Met Arg Pro Val Tyr Pro Thr Lys Thr Phe Pro Asn His
195 200 205

Tyr Ser Ile Val Thr Gly Leu Tyr Pro Glu Ser His Gly Ile Ile Asp
210 215 220



Asn Lys Met Tyr Asp Pro Lys Met Asn Ala Ser Phe Ser Leu Lys Ser
225 230 235 240

Lys Glu Lys Phe Asn Pro Glu Trp Tyr Lys Gly Glu Pro Ile Trp Val
245 250 255

Thr Ala Lys Tyr Gln Gly Leu Lys Ser Gly Thr Phe Phe Trp Pro Gly
260 265 270

Ser Asp Val Glu Ile Asn Gly Ile Phe Pro Asp Ile Tyr Lys Met Tyr
275 280 285

Asn Gly Ser Val Pro Phe Glu Glu Arg Ile Leu Ala Val Leu Gln Trp
290 295 300

Leu Gln Leu Pro Lys Asp Glu Arg Pro His Phe Tyr Thr Leu Tyr Leu
305 310 315 320

Glu Glu Pro Asp Ser Ser Gly His Ser Tyr Gly Pro Val Ser Ser Glu
325 330 335

Val Ile Lys Ala Leu Gln Arg Val Asp Gly Met Val Gly Met Leu Met
340 345 350

Asp Gly Leu Lys Glu Leu Asn Leu His Arg Cys Leu Asn Leu Ile Leu
355 360 365

Ile Ser Asp His Gly Met Glu Gln Gly Ser Cys Lys Lys Tyr Ile Tyr
370 375 380

Leu Asn Lys Tyr Leu Gly Asp Val Lys Asn Ile Lys Val Ile Tyr Gly
385 390 395 400

Pro Ala Ala Arg Leu Arg Pro Ser Asp Val Pro Asp Lys Tyr Tyr Ser
405 410 415

Phe Asn Tyr Glu Gly Ile Ala Arg Asn Leu Ser Cys Arg Glu Pro Asn
420 425 430

Gln His Phe Lys Pro Tyr Leu Lys His Phe Leu Pro Lys Arg Leu His
435 440 445

Phe Ala Lys Ser Asp Arg Ile Glu Pro Leu Thr Phe Tyr Leu Asp Pro
450 455 460

Gln Trp Gln Leu Ala Leu Asn Pro Ser Glu Arg Lys Tyr Cys Gly Ser
465 470 475 480

Gly Phe His Gly Ser Asp Asn Val Phe Ser Asn Met Gln Ala Leu Phe
485 490 495

Val Gly Tyr Gly Pro Gly Phe Lys His Gly Ile Glu Ala Asp Thr Phe
500 505 510

Glu Asn Ile Glu Val Tyr Asn Leu Met Cys Asp Leu Leu Asn Leu Thr
515 520 525

Pro Ala Pro Asn Asn Gly Thr His Gly Ser Leu Asn His Leu Leu Lys
530 535 540

Asn Pro Val Tyr Thr Pro Lys His Pro Lys Glu Val His Pro Leu Val
545 550 555 560

Gln Cys Pro Phe Thr Arg Asn Pro Arg Asp Asn Leu Gly Cys Ser Cys
565 570 575

Asn Pro Ser Ile Leu Pro Ile Glu Asp Phe Gln Thr Gln Phe Asn Leu
580 585 590

Thr Val Ala Glu Glu Lys Ile Ile Lys His Glu Thr Leu Pro Tyr Gly
595 600 605

Arg Pro Arg Val Leu Gln Lys Glu Asn Thr Ile Cys Leu Leu Ser Gln
610 615 620

His Gln Phe Met Ser Gly Tyr Ser Gln Asp Ile Leu Met Pro Leu Trp
625 630 635 640

Thr Ser Tyr Thr Val Asp Arg Asn Asp Ser Phe Ser Thr Glu Asp Phe
645 650 655
Ser Asn Cys Leu Tyr Gln Asp Phe Arg Ile Pro Leu Ser Pro Val His
660 665 670

Lys Cys Ser Phe Tyr Lys Asn Asn Thr Lys Val Ser Tyr Gly Phe Leu
675 680 685

Ser Pro Pro Gln Leu Asn Lys Asn Ser Ser Gly Ile Tyr Ser Glu Ala
690 695 700

Leu Leu Thr Thr Asn Ile Val Pro Met Tyr Gln Ser Phe Gln Val Ile
705 710 715 720

Trp Arg Tyr Phe His Asp Thr Leu Leu Arg Lys Tyr Ala Glu Glu Arg
725 730 735

Asn Gly Val Asn Val Val Ser Gly Pro Val Phe Asp Phe Asp Tyr Asp
740 745 750

Gly Arg Cys Asp Ser Leu Glu Asn Leu Arg Gln Lys Arg Arg Val Ile
755 760 765

Arg Asn Gln Glu Ile Leu Ile Pro Thr His Phe Phe Ile Val Leu Thr
770 775 780

Ser Cys Lys Asp Thr Ser Gln Thr Pro Leu His Cys Glu Asn Leu Asp
785 790 795 800

Thr Leu Ala Phe Ile Leu Pro His Arg Thr Asp Asn Ser Glu Ser Cys
805 810 815

Val His Gly Lys His Asp Ser Ser Trp Val Glu Glu Leu Leu Met Leu
820 825 830

His Arg Ala Arg Ile Thr Asp Val Glu His Ile Thr Gly Leu Ser Phe
835 840 845

Tyr Gln Gln Arg Lys Glu Pro Val Ser Asp Ile Leu Lys Leu Lys Thr
850 855 860

His Leu Pro Thr Phe Ser Gln Glu Asp
865 870



<210> SEQ ID NO 5 

<211> LENGTH: 646 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: promoter 

<222> LOCATION: (0)...(0) 

<223> OTHER INFORMATION: PC-1 promoter sequence 
<220> FEATURE: 

<221> NAME/KEY: misc_feature

<222> LOCATION: (1)...(646)

<223> OTHER INFORMATION: n = A,T,C or G 



<400> SEQUENCE: 5 



aaaccaacgt agngacgtgg gaatcgaaat atccttaggt gtgttcagta tatgtgaacc 60
cacgtatttt aagtggacga tttctctctc agagtaccgt aggtagtggg ggacggggcg 120
cagaggggga gaaacagaaa gtcgccttcc tccatggttc atttgcattt ccatccagaa 180
actcacaggt cgaccccaag actccactct ctcccgcctt tgagaagccg gaccggcatc 240
ggcggctgca tccttctcct cctccccgct ctattttggg gccccatgat ctcatgcctt 300
ctgcagacca cacgctgcaa ttccagccca gcccgcgccg cgaggccacg cagggcgatt 360
cctgcaagtg tcgggagggt ggccggggcg cggggagggg acggcttggg gggaagttta 420
agacacgccc acgtaaggga cccaaaataa ccgacacaca gagtgcccga aatcagacag 480
gaagccaaat aatccggggc gttgagtcgc tttgccctga ctgcgagagc cgggtgtagg 540
gcggggagcc aaggatctga ccgcgagggg cgggcgcggc ggggaggggc ggggcggggc 600
gggcggcgcg gggcctatta aaggcgcggc ggggcagcgg ggccgg 646

<210> SEQ ID NO 6 

<211> LENGTH: 350 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens

<220> FEATURE: 

<221> NAME/KEY: 3'UTR 

<222> LOCATION: (0)...(0) 

<223> OTHER INFORMATION: allele "A" 

<400> SEQUENCE: 6 



agccaagaag actgatatgt tttttatccc caaacaccat gaatcttttt gagagaacct 60

tatattttat atagtcctct agctacacta ttgcattgtt cagaaactgt cgaccagagt 120 

tagaacggag ccctcggtga tgcggacatc tcagggaaac ttgcgtactc agcacagcag 180 

tggagagtgt tcctgttgaa tcttgcacat atttgaatgt gtaagcattg tatacattga 240 

tcaagttcgg gggaataaag acagaccaca cctaaaactg cctttctgct tctcttaaag 300 

gagaagtagc tgtgaacatt gtctggatac cagatatttg aatctttctt 350



<210> SEQ ID NO 7 

<211> LENGTH: 350 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: 3'UTR

<222> LOCATION: (0)...(0) 

<223> OTHER INFORMATION: allele "P" 

<400> SEQUENCE: 7 



agccaagaag actgatatgt tttttatccc caaacaccat gaatcttttt gagagaacct 60 

tatattttat atagtcctct agctacacta ttgcattgtt cagaaactgt cgaccagagt 120 

tagaacagag ccctccgtga tgcggacatc tcagggaaac ttgcgtactc agcacagtag 180

tggagagtgt tcctgttgaa tcttgcacat atttgaatgt gtaagcattg tatacattga 240 

tcaagttcgg gggaataaag acagaccaca cctaaaactg cctttctgct tctcttaaag 300 

gagaagtagc tgtgaacatt gtctggatac cegatatttg aatctttctt 350 



<210> SEQ ID NO 8 

<211> LENGTH: 350 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: 3'UTR 

<222> LOCATION: (0)...(0) 

<223> OTHER INFORMATION: allele "N" 

<400> SEQUENCE: 8 



agccaagaag actgatatgt tttttatccc caaacaccat gaatcttttt gagagaacct 60 

tatattttat atagtcctct agctacacta ttgcattgtt cagaaactgt cgaccagagt 120 

tagaacagag ccctcggtga tgcggacatc tcagggaaac ttgcgtactc agcacagtag 180 

tggagagtgt tcctgttgaa tcttgcacat atttgaatgt gtaagcattg tatacattga 240 

tcaagttcgg gggaataaag acagaccaca cctaaaactg cctttctgct tctcttaaag 300 

gagaagtagc tgtgaacatt gtctggatac cagatatttg aatctttctt 350 



<210> SEQ ID NO 9 

<211> LENGTH: 23 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 
<400> SEQUENCE: 9

ctgtgttcac tttggacatg ttg 23



<210> SEQ ID NO 10 

<211> LENGTH: 22 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 10 

gacgttggaa gataccaggt tg 22 



<210> SEQ ID NO 11 

<211> LENGTH: 109 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<220> FEATURE: 

<221> NAME/KEY: misc_feature

<222> LOCATION: (1)...(109) 

<223> OTHER INFORMATION: n = A,T,C or G 

<400> SEQUENCE: 11 

ctctcgctgg taggtccgcg gccaggcccc ggcgcccggg agggctggga atacngggag 60 
ggcggcgccg agctcctgcg ctctcagcgc actcagcacc gggcacgga 109 



<210> SEQ ID NO 12 

<211> LENGTH: 279 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 12 

tgagctccac cgggccggcg gccgctctag aactagtgga tcatgccact gtaccctagc 60

ctgggtaaca gagtaagaca ctatctctaa aaataaaaaa taagataaaa tattttttaa 120

aaaagaaacc atgtaatttt ctcttttctc cctacaggta ttgagaaggt aattaggtgt 180

gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtatgtgtgt gcacagcctt attaagaatg 240

tgattgaggt aaacattatc tcctattccc aaggggtac 279



<210> SEQ ID NO 13

<211> LENGTH: 243 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 13 

agatttttgc cttactttat taccccatct gtattttcta aagtagtatt tgaacctagt 60

gtacacctaa cttagttgtn ttcgttgatg tttactttga attotataat gattagaaac 120

atctgactta tcgttcaatt ttttcagtta accaggtaag gatgagcagg gaaaaaagtg 180

gagttatggt cattaggaaa agatccacta gttctagagc ggccgccacg cccggtggag 240

ctt 243

<210> SEQ ID NO 14

<211> LENGTH: 231

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<220> FEATURE:

<221> NAME/KEY: misc_feature

<222> LOCATION: (1)...(231)

<223> OTHER INFORMATION: n = A,T,C or G

<400> SEQUENCE: 14

cgcggcggcc gttctagaac tagtggatca tactcacgaa gacagcaatt ctgtgttcac 60

tttggacatg ttgaatttga gacataaaac acattttgct gatgtttgtt tctagaacat 120

agtcaaggtc aggtgctcgt tgggctctgc agcaacctgg tatcttccaa cctcttaacg 180

gggctntaca taagtgttat cttttatatt aagantcatg gctattgggc c 231



<210> SEQ ID NO 15 

<211> LENGTH: 313 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 15 



aatctgttca catactttgt ttgtggaatc tgtcttaatg tgtctcacaa gcatcacaat 60 

tattattact gttaagtgtg ttcattttat tttcttgaaa atattttagg tgagaagcag 120 

ggtaagatta tattctgagg tattaatttt ttctttttta gaagtacagc atcatttttt 180 

tctttccaaa ttaagatgat aaaaataata aaatcactgg tttattaaac attacaggtt 240

gagtatcctt tatccaaaat gtttggtatg agaactgttt tggattttgg acttttttgg 300 

attttgcaat att 313 



<210> SEQ ID NO 16 

<211> LENGTH: 313 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 16 



ccgcagcccg ggggatcaca cagaccttag tggaaaatct tcactggacc tgtgccaaga 60 

agggggtaca tcttcattgg atatgtcttg tctttgcttc tttaaacatt tttttttctt 120 

tttcattacc caggtttgaa actaagligag taacttcaga gtttactgct ggaatatcac 180

catttcagtg agattgacta ggcaggcagt ctttcttgga aaagtactgg cagaacctaa 240 

ctgtttcact aaacttttct aatgggcaaa gtagttgaac cttgtgtagg gcgccttatc 300 

tttaataatg tga 313 



<210> SEQ ID NO 17 

<211> LENGTH: 382 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 17 



taagagaaaa atgaagtcat ctttaagatt ggatttgtat ccacagtgtt gctttataat 60
tcatcctgaa tttttatctg attaaaatcc ctcctgggta atttttttta cgtgatttag 120
actgctgtgg taccactgct aaatgaggta agccaattgt cagatgtatt taataacaat 180
gtttattttt ttcccttcta gaaaaatgtt caccgtaagc tctgcatttc aacttctatc 240
tgtttgaaga agtgagatgg gattgtaaca ttttttgagg gaatagattt aagataaaag 300
aaaaacaact tattttccaa taggtagtta agtaaggaaa cccaggttct gatctttgct 360
ctgccacaaa ctagctgtgg ct 382



<210> SEQ ID NO 18 

<211> LENGTH: 312 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 18 



actacataaa atcttaagag gttgcgtttt gccattacct gatttttttg tttttctttc 60
cttaaactta ttataattcc atgtagcttc agttatcggt ttctttttga tgattttttt 120
ctgtgaatgt atttaacatt aagtaaacac aacttgcata taatctgttt tatctttttt 180
agggattaac cagtgagttc tttgtttttc tactaaaata gttaattatt ctcatctatt 240
tcaatcagag taaaataacc agattctcta gagcttttaa taactgattt catttagtgt 300
gtctgtggcc at 312


<210> SEQ ID NO 19

<211> LENGTH: 382

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 19

taatctctga ctatttaata tgttgttgct gcttaagagt catattacat gattattgtc 60
gtctaagtgc tgaagcttgt tgaccttaaa agcattctag cactagagag gaatgcattg 120
gtgtggtatg aaaacatact ttcctaagag atgaatgttg catgatttct taattttcct 180
tcattttctg ctccagattt ggaatgggta tgtgaaatga attttttcta ggatctgtaa 240
tatagaacag cttattctta tgtaatctcc tttttattga atcctgagct ttagcatttg 300
agtgatatgt tggctgaaaa atgagaactg aagaactctt tctcaaagag tttagataga 360
tggtaaatgg acagtaaaac ta 382


<210> SEQ ID NO 20

<211> LENGTH: 307

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 20

gggaaaataa agttttcaaa taaaaccctt gatttcaaac acaatagatg cgaaatagca 60
tttactagct cttaatgaca ttttcaatga aaaaaactat attttacacc caaacaattg 120
tcagccatct tttatttttg tttgttcttc attttagttc agtaagatga aaggtctgta 180
ggcaattaat ttctattgta aatacttcgt tttgtagaaa tgatatacta ttttccccta 240
gactacaaca aaactttgct atttgctatg atgttttata tcgaaataaa ttctttagta 300
aatgatc 307


<210> SEQ ID NO 21

<211> LENGTH: 329

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 21

gaatttcaaa gctgtaaatt atttctcagt agaactgtta caccagtgtt ataaaattta 60
atccctatca attgaggaat tattttttcc attctgtttt tcaatgtgtt cgtaaaatat 120
tacattttga tactgtttga tttagaccac acagtgagta agtacatttt tctcagtaat 180
tatttcatta aacccagtca tcaaactgaa cctcgctttg aaggaggctg ctagaccatt 240
ttataagatt ctatcatttc tggaaaaagc aagtattata cacaatatta ctaaatataa 300
ggatgcactt taaacaaaat aagagttgg 329



<210> SEQ ID NO 22 

<211> LENGTH: 381 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 
<400> SEQUENCE: 22

gtcttagttt aatgtgaatc agctcattgt agttgcatcc actggcccaa atctatcaat 60
ctgtcggtct ttctttcttt ctttgtttct ttcttttttt ttttttttaa cagagatagc 120
tttatgtata aatagccatt agtgtggaag gtatcacatg aggttgtgct tcccattctt 180
aggtcatcat catggtaatc tgaatttgca ttatttactc ttcaggataa agggctgaag 240
aaagtttact tgatggtttc ccaatttttt ctgaatgttg tagttaattc ttttttaaaa 300
atgtagtttc ttatggacag tctttaggaa aaaaatacat taaatataaa atataagtga 360
aacacagaat tcacagaaac c 381


<210> SEQ ID NO 23

<211> LENGTH: 382

<212> TYPE: DNA

<400> SEQUENCE: 23

gattttgaaa aaagtgaagt gataggtaca gctgaaattc tgtcttacct atcagatctt 60
caactaatat gagtgctaca cccatgttta acgaatttaa ccttggaagt gaaagaagtt 120
ctgctctgca tattaaattt tttgttaaag ttacagcatg ttttgggatt ttttttttct 180
cctaggcatg gtactattca tgtaagtata tctctgtgat aactttgaat atggtcatat 240
taagaatacc ttcctttagg ccgggcacag tggctcatgc ctgtaatcgc agcactttgg 300
gaggccaaag tgggtggtca cctgaggtca ggagttcgag accagcctgg ccaacatggt 360
gaaaccctaa aaatacatat ac 382


<210> SEQ ID NO 24

<211> LENGTH: 361

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 24

gatccaaact ctgcatttaa ataccaaggc aggttttaaa gagttcattt aagtcattac 60
attgtagcca ctgaaaggaa ttagacagac ctttagggat ctgacattct atatttttgt 120
attatgtttt aattatagta tacaatcaac tattaattct tatgtttgtt cccctccagt 180
taactatgaa ggcattgccc gaaatctttc tgtgagtatc tttattttcc attatctagt 240
tatttttact tttgtataat atatattgag agaaaagttt cagcatctat tattgggatt 300
gaaggattag aatattttag taatctgggc caacatggaa atgctgtgta gtttaaagat 360
c 361


<210> SEQ ID NO 25

<211> LENGTH: 384

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 25

ctgatgaaat gtttgtgaaa aaaaatttca tatgaagtta gaaagcaatt tcaagaaaag 60
ttgacacttt ttatagatat tagggaaata tctttcccta ataaatatct ttccctaaaa 120
aagttgacac ttttttagat attagggaaa taatagtttt tctttgctgt ttgcaatttc 180
agtgccgggg cattgtaagt tctgacagtc tcccaggtaa acttagtctg atcggttagt 240
gattcagggt aaccattggg ccctttctaa caatattgtt atgtgaaaac tgtataagta 300
tgattctctt cactctaacc caggatttct aatgtcggcc tatggatgtt tgagttagat 360
aattctttgt tgtggagagc tgtc 384

<210> SEQ ID NO 26

<211> LENGTH: 381

<212> TYPE: DNA

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 26 



aaaagataga ggtgacttct taatgctttt caaagccagg tggttttatt taccgttgtg 60 

ttggtttaac aaaatagtta catacttttt aatcaatgaa aataatgtta tgattatcaa 120 

ttatgtttta tgaaaggact ttacattttt aattcatata tgtcaacatt aggaatgcaa 180 

gtgagtaaac ctattatact taattggatt aaatctaaag aaaaaatgat atgcaaagtt 240 

ttagacttga aaacatactg tgattatatg tcttgaatga gaattaatgg aacatacttt 300 

cataaagcta tttttctttg aacattaaag aattttgtta aagttttata ttcattggct 360

attactaaaa agtcaaaaaa c 381 



<210> SEQ ID NO 27 

<211> LENGTH: 383 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens

<400> SEQUENCE: 27 



aaaactaaga gacctatcct agatgtcctt agattatgtg tgtgataggg ttaaaactat 60 

atttcccaca aagtccactg agcgtggtag ttttcctctt atcttatcat aaccagtttg 120 

tatatgtaca atgtggataa cagaattttt gggaccaact tgtagacagc tgaaatgcac 180 

tgataaactt ccttttctgg ccatctaggc cctgtgtggt aagtgtgaac aggtgccttt 240 

tttcccttct gaaaatagac ctgaaatagg attatcaaaa gcaggtcaca ttgtaggcaa 300

ctttgtggag atgatggtga ggcaagacag atttttacct tcttcctgac tctcagactc 360

actgaagaaa tgtggggaac atg 383 



<210> SEQ ID NO 28 

<211> LENGTH: 384 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 28 



catatcagta tttctattaa aaataaccta gtcttaaata ctctaaaacc caagagagtt 60 

ttatactttt attttagtta aagagtaaat gactcatgta tttggtttta aaaaagtaaa 120

gatcatggca caagtctact atttgtttga tttgaaacat ctaagtaact ctaccatctt 180 

gaaattatgc agatttactt cggtaagtat cgtcaagaag tttggtccag tatgtatggt 240 

ttgatagcac cctctgcata gcatgtgctg taaaaatact taataatcaa attagaattt 300 

aggagtgggg gtaggtaaac atatgtttta attctagggg gcgcatgtaa atcttttgtg 360 

atatatcttt tctctttcta gttt 384 



<210> SEQ ID NO 29 

<211> LENGTH: 339 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 29 



gtgaaagagc aacactcttg ccttgaaaga gaaaaaaaaa tccactaata caagactatc 60




ataaatgatc tttgttctat gttggaataa tcaatctata gcggtctatg ttacaaaatt 120 

taaaacatgt ctctcagtcc ttacaaatag ttttataacc ttttttcaga ttttgccgaa 180 

ggtaaggcat gctacacact caagctcgga atgtgaagca ggcattttct catcagtgtg 240 

aaatgcagag aactggcttg ggggtattat ttgagaataa ccaataaaat aaagggagtt 300 

ctggaggacc acctgatgaa acatagaggt ttctttgct 339 



<210> SEQ ID NO 30 

<211> LENGTH: 327 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 30 



gtcttcttaa ttgtttatgc ttgtaccctt tgtaatcagt ttttttaata gttaaaagta 60 

aatcttcaat ataattaagt agaggaaagg attagatgag tgtatcacac tatatattat 120 

catataatgc acactaacta catttatttt catcctgtga cccaagagaa gattagacag 180 

aaatgcaagt atttgtcacc tctttatgtg tggccatttc aaattaatga ttaagcagaa 240 

cattaaatgc atagtttctc actgttcacc ttggctttat actcagttcc cgcattagag 300 

gaacactgaa gagggagtca gaaaaat 327 



<210> SEQ ID NO 31 

<211> LENGTH: 427 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 31 



tttaatattg taaagcattt ttacacttta gttagaaaaa aagatgaata tactagtagg 60

aaaataggga aggacatgag ctgacagcta gagcttcata attttatgat gtagttcacc 120

tttaaatatt aataaagcaa ttttcttctc tgtgcctgat atctgagagt tcttctcatt 180

ttcgttcttc aggacaccac cacgtaagtt ttttcctctc ctgaccttcc cttttctcct 240

ttttgttttc tttcttgttt ataaatccta ccatacatta tagggtaata tatatattac 300 

ctattatata tatataaaat attacctat-k ttata-tatal: a-ttatatata taatatatat 360 

aaagtatata tattactatt ttatatatat atagctatat atatatacct ttgtttattt 420 

attgtga 427 



<210> SEQ ID NO 32 

<211> LENGTH: 380 

<212> TYPE: DNA

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 32 



ctcatcttga aaagacttct taaatatttt atttttgtaa aggacttgac caaacacata 60 

acattttccc tcgaccctgt acttgggaaa gttttacagg tttaagatgg tactcagcta 120

atttttaaaa atgctcccct aaccatgaga aagtataatt tcctatgtta tttgtgaaga 180

atgaaaaagt tgtcctcttt tctctttgta gaactattca aggtaaataa tgttaactct 240

atatttgata attttaatga atttgtgcac atataggcat aattcatatg tataggactt 300 

atggtctaaa ttaaatgaat taataccaaa tacattctta aaggtttaac tttgagaata 360 

ctagtacaca aaaattctac 380 



<210> SEQ ID NO 33 
<211> LENGTH: 384 
<212> TYPE: DNA

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 33 

ctgggtgata tagcacgact ctgtctctaa acaaaaaaca aaacaaaacg aagactgaag 60 

ccaaacttga ctttatcttt atttactata aatgctaatt ttgaatcatg gtgttaattt 120 

atttcacacg tcaacatggt cccttgttct tttgaaacta cactggcttc tatcttgttt 180 

cagttataga ggcagtaaga acatatttca ttactcttaa aaataggaat taccatccag 240 

tagaaatggg attaccatcc agttgagtca acagaacctt ttttatccag tgtcgtatgt 300 

ttatgtgtat gacacttctg actacacagg aagcctcttg aaatatctga ttaattttga 360 

tgttttgctc aatgttcagt aaaa 384

<210> SEQ ID NO 34 

<211> LENGTH: 328 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 34 

gttcttatat ttaattattg gttggaattt gatttttata tgtattaaaa gcatgctcta 60 

ctgaaatatt catcaaaagg aagatagtta tttctttctt aaaatgaata ttggcatgtt 120 

ttacagaaaa atgtgtggta agtagctttt gtatatttac tttgcatgtt gaaaatctag 180 

acatatgcat atttgtttat gtcacccatc tgacattaca gtgagagaaa gcacaactga 240 

gtacacatgg acttcgaaat tataggatgc ttttaaattt gatcttttaa gatgacatat 300 

ctttggggaa gactaccctg tctgcttt 328 

<210> SEQ ID NO 35 

<211> LENGTH: 384 

<212> TYPE: DNA 

<213> ORGANISM: H. sapiens 

<400> SEQUENCE: 35 

aattaaacaa acatgcatgg tatgtattag aaggaaagct actcaagagg agagatgatg 60 

cctaacaaat catgtggcac gttccacttc agagctgaaa tctcgtaaat gattaaactg 120 

gggagatgga gcacttatag aagtgaactg agtgttctct tggtaacttt tcttttatat 180

ttcctattct cctagcatgg atttaaaaaa gaaaaatatt cctatcctgc tcactggtaa 240

ttaacatagg tttaaaatgg cttcaaatgt ggccctatag acggttaaaa ttgtacctta 300 

tcttggcaaa acttcagagc accagtcagt gcatgcaagg tgccattttt tattgagatg 360 

cttagaatgt ttctttctgt gcac 384 
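
Each record in the sequence listing above declares its length in the <211> field and repeats a running base count at the end of every row, so the two can be cross-checked mechanically. The following Python sketch is an illustration only and is not part of the original filing. The function name check_record is hypothetical, and the sample record is a shortened toy example rather than one of the listed sequences.

```python
import re


def check_record(record: str):
    """Return (sequence id, declared length, counted length) for one record.

    Assumes the layout used in the listing above: a <210> line carrying the
    SEQ ID NO, a <211> line carrying LENGTH, and the residues after the
    <400> line, written in groups with a running count at the end of a row.
    """
    seq_id = int(re.search(r"<210>\s*SEQ ID NO\s*(\d+)", record).group(1))
    declared = int(re.search(r"<211>\s*LENGTH:\s*(\d+)", record).group(1))
    # Everything after the "<400> SEQUENCE: n" line is sequence text.
    body = record.split("<400>", 1)[1].split("\n", 1)[1]
    # Count letters only, so spaces, running counts and stray punctuation
    # left by scanning are ignored.
    counted = len(re.findall(r"[A-Za-z]", body))
    return seq_id, declared, counted


# Toy record (hypothetical, 25 bases) in the same layout as the listing.
TOY_RECORD = """<210> SEQ ID NO 99
<211> LENGTH: 25
<212> TYPE: DNA
<400> SEQUENCE: 99

acgtacgtac acgtacgtac 20
acgta 25
"""

seq_id, declared, counted = check_record(TOY_RECORD)
print(f"SEQ ID NO {seq_id}: declared {declared}, counted {counted}")
```

A mismatch between the declared and counted values would point to a row where scanning dropped or added a character; the check cannot detect a character that was merely misread as another letter.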



What is claimed is: 

1. An isolated nucleic acid encoding the amino acid
sequence of SEQ ID NO: 4.

2. An isolated oligonucleotide consisting of at least 18
contiguous nucleotides from an isolated nucleic acid encoding
SEQ ID NO: 4, wherein said oligonucleotide includes a
portion of the nucleic acid sequence that encodes a
glutamine at position 121 of SEQ ID NO: 4.

3. A method for detecting a predisposition to insulin
resistance in an individual, the method comprising:

analyzing an individual for the presence of a genetic
polymorphism in the genomic sequence of a human
PC-1 allele, wherein said human PC-1 allele encodes a
glutamine amino acid at position 121 of SEQ ID NO: 4,
and wherein the presence of said glutamine at position
121 of SEQ ID NO: 4 is indicative of a predisposition
to insulin resistance.

* * * * *
ASSOCIATED WITH AUTISM SPECTRUM
DISORDERS 

The present application claims the benefit of U.S.
Provisional Patent Application Ser. No. 60/049,803, filed Jun.
17, 1997.

The subject matter of this application was made with
support from the United States Government under Grants
No. R01AA08666, R01 NS 24287, R01HD34295,
R01HD34969, and 2P30 ES01247 from the National
Institutes of Health and Grant No. R824758 from the
Environmental Protection Agency. The United States Government
may retain certain rights.

FIELD OF THE INVENTION 

The present invention relates to a method of screening 
subjects for genetic markers associated with autism. The 
invention further relates to isolated nucleic acids having 
polymorphisms associated with autism, the polypeptide 
products of those nucleic acids, and antibodies specific to the 
polypeptides produced by the mutated genes. 

BACKGROUND OF THE INVENTION 

Autism is a behaviorally defined syndrome characterized 
by impairment of social interaction, deficiency or abnormal- 
ity of speech development, and limited activities and interest 
(American Psychiatric Association, 1994). The last category 
includes such abnormal behaviors as fascination with spin- 
ning objects, repetitive stereotypic movements, obsessive 
interests, and abnormal aversion to change in the environ- 
ment. Symptoms are present by 30 months of age. The 
prevalence rate in recent Canadian studies using total
ascertainment is over 1/1,000 (Bryson, S. E. et al., J. Child
Psychol. Psychiat., 29, 433 (1988)).

Attempts to identify the cause of the disease have been
difficult, in part, because the symptoms do not suggest a
brain region or system where injury would result in the
diagnostic set of behaviors. Further, the nature of the
behaviors included in the criteria precludes an animal model of
the diagnostic symptoms and makes it difficult to relate much of
the experimental literature on brain injuries to the symptoms
of autism.

Several quantitative changes have been observed in autistic
brains at autopsy. An elevation of about 100 g in brain
weight has been reported (Bauman, M. L. and Kemper, T. L.,
Neurology, 35, 866 (1985)). While attempts to find anatomical
changes in the cerebral cortex have been unsuccessful
(Williams, R. S. et al., Arch. Neurol., 37, 749 (1980);
Coleman, P. D. et al., J. Autism Dev. Disord., 15, 245
(1985)), several brains have been found to have elevated
neuron packing density in structures of the limbic system
(Bauman, M. L. and Kemper, T. L., Neurology, 35, 866
(1985)), including the amygdala, hippocampus, septal nuclei
and mammillary body. Multiple cases in multiple labs have
been found to have abnormalities of the cerebellum. A
deficiency of Purkinje cell and granule cell number, as well
as reduced cell counts in the deep nuclei of the cerebellum
and neuron shrinkage in the inferior olive, have been
reported (Bauman, M. L. and Kemper, T. L., Neurology, 35,
866 (1985); Bauman, M. L. and Kemper, T. L., Neurology,
36 (suppl. 1), 190 (1986); Bauman, M. L. and Kemper, T. L.,
The Neurobiology of Autism, Johns Hopkins University
Press, 119 (1994); Ritvo, E. R. et al., Am. J. Psychiat., 143,
862 (1986); Kemper, T. L. and Bauman, M. L., Neurobiology
of Infantile Autism, Elsevier Science Publishers, 43 (1992)).
Imaging studies have allowed examination of some anatomical
characteristics in living autistic patients, providing
larger samples than those available for histologic evaluation.
In general, these confirm that the size of the brain in autistic
individuals is not reduced and that most regions are also
normal in size (Piven, J. et al., Biol. Psychiat., 31, 491
(1992)). Reports of size reductions in the brainstem have
been inconsistent (Gaffney, G. R. et al., Biol. Psychiat., 24,
578 (1988); Hsu, M. et al., Arch. Neurol., 48, 1160 (1991)),
but a new, larger study suggests that the midbrain, pons, and
medulla are smaller in autistic cases than in controls
(Hashimoto, T. et al., J. Aut. Dev. Disord., 25, 1 (1995)). In
light of the histological effects reported for the cerebellum,
it is interesting that the one region repeatedly identified as
abnormal in imaging studies is the neocerebellar vermis
(lobules VI and VII; Gaffney, G. R. et al., Am. J. Dis. Child.,
141, 1330 (1987); Courchesne, E. et al., N. Engl. J. Med.,
318, 1349 (1988); Hashimoto, T. et al., J. Aut. Dev. Disord.,
25, 1 (1995)). Not all comparisons have found a difference
in neocerebellar size (Piven, J. et al., Biol. Psychiat., 31, 491
(1992); Kleiman, M. D. et al., Neurology, 42, 753 (1992)),
but a recent reevaluation of positive and negative studies
(Courchesne, E. et al., Neurology, 44, 214 (1994)) indicates
that a few autistic cases have hyperplasia of the
neocerebellar vermis, while many have hypoplasia. Small samples of
this heterogeneous population could explain disparate
results regarding the size of the neocerebellum in autism.
The proposal that the cerebellum in autistic cases can be
either large or small is reasonable from an embryological
standpoint, because injuries to the developing brain are
sometimes followed by rebounds of neurogenesis (e.g.,
Andreoli, J. et al., Am. J. Anat., 137, 87 (1973); Bohn, M. C.
and Lauder, J. M., Dev. Neurosci., 1, 250 (1978); Bohn, M.
C., Neuroscience, 5, 2003 (1980)), and it is possible that
such rebounds could overshoot the normal cell number.
Further, because increased cell density has been observed in
the limbic system, the cerebellum is not the only brain
region in which some form of overgrowth might account for
the neuro-anatomy of autistic cases. It may well be that some
autism-inducing injuries occur just prior to a period of rapid
growth for the cerebellar lobules in question or the limbic
system, leading to excess growth, while other injuries
continue to be damaging during the period of rapid growth,
leading to hypoplasia. However, the hypothesis that autism
occurs with both hypoplastic and hyperplastic cerebella calls
into question whether cerebellar anomalies play a major role
in autistic symptoms.

A particularly instructive result has appeared in an MRI
study on the cerebral cortex (Piven, J. et al., Am. J. Psychiat.,
14, 734 (1992)). Of a small sample of autistic cases, the
majority showed gyral anomalies (e.g., patches of
pachygyria). However, the abnormal areas were not located
in the same regions from case to case. That is, while the
functional symptoms were similar in all the subjects, the
brain damage observed was not. The investigators argue
convincingly that the cortical anomalies were not
responsible for the functional abnormalities. This is a central
problem in all attempts to screen for pathology in living
patients or in autopsy cases. While abnormalities may be
present, it is not necessarily true that they are related to the
symptoms of autism.

To teratologists, the physical anomalies of a neonate,
child, or adult can serve as a guide to when the embryo was
injured. Years of research have amplified the details of that
timetable for the nervous system (Rodier, P. M., Dev. Med.
Child Neurol., 22, 525 (1980); Bayer, S. A. et al.,
Neurotoxicology, 14, 83 (1993)). In the case of autism, lack
of specific information on the neuroanatomy associated with
the disease has made it difficult to estimate the stage of
development when the disorder arises. However, in 1993,
Miller and Stromland reported a finding that conclusively
identified the time of origin for some cases. They observed
that the rate of autism was 33% in people exposed to
thalidomide between the 20th and 24th days of gestation,
and 0% in cases exposed at other times (Stromland, K. et al.,
Devel. Med. Child. Neurol., 36, 351 (1994)). Their deduction
regarding the time of injury was not based on
neuroanatomy, which was not known in their living subjects.
Instead, it was based on the external stigmata of the cases.
Because thousands of thalidomide-exposed offspring
have been evaluated for somatic malformations, the array of
injuries associated with the drug is well-known, and the time
when each arises has been carefully defined (Miller, M. T.,
Trans. Am. Ophthalmol. Soc., 89, 623 (1991)). Of five cases
of thalidomide-induced autism, four had malformations of
the ears, without limb malformation, and the fifth had
malformation of the ears, forelimb, and hindlimb. Thalidomide
is not teratogenic before the 20th day of gestation.
Starting on day 20 exposure causes ear malformation and
abnormalities of the thumb. Limb malformations (other than
those of the thumb) first appear with exposure on the 25th
day, with effects moving from the forelimb to the hindlimb
as exposure occurs at later stages. After the 35th day,
thalidomide produces no malformations. Thus, the cases
with malformations restricted to the ear must have been
exposed before day 25, and the one patient with multiple
malformations can only be explained as a case of repeated
injuries at several stages of development.

In fact, the idea that autism might arise very early in
gestation was suggested long ago. Steg and Rapoport (J. Aut.
Child. Schiz., 5, 299 (1975)) noted the significant increase in
minor physical anomalies among children with autism, and
realized that they indicated an injury in the first trimester.
Several studies of minor malformations have found ear
effects to be the most common anomalies in autism (Walker,
H. A., J. Aut. Child. Schiz., 7, 165 (1977); Campbell, M. et
al., Am. J. Psychiat., 135, 573 (1978)), and the most recent
study shows that they are not only the best discriminator
between people with autism and normal controls, but also
the only anomaly that discriminates autism from other
developmental disabilities (Rodier, P. M. et al., Teratology,
55, 319 (1997)). Ear anomalies are among the earliest of all
minor physical malformations in their time of origin.

External malformations are not the only evidence which
puts the time of injury in autism at the time of neural tube
closure. The cranial nerve dysfunctions observed in the
patients with autism secondary to thalidomide exposure,
namely facial nerve palsy, Duane syndrome (lack of abducens
innervation with reinnervation of the lateral rectus by the
oculomotor nerve), abnormal lacrimation, gaze paresis, and
hearing deficits (Stromland, K. et al., Devel. Med. Child.
Neurol., 36, 351 (1994)), suggest that the earliest-forming
structures of the brain stem were damaged, and it is now
known that these form during neural tube closure (Bayer, S.
A. et al., Neurotoxicology, 14, 83 (1993)). Subsequent
studies have shown that a human brain from a patient with
autism has the same pattern of brain stem injury predicted by
the thalidomide cases (Rodier, P. M. et al., J. Comp. Neurol.,
370, 247 (1996)). Perhaps even more importantly, the
autopsied brain has a shortening of the brain stem in the region of
the fifth rhombomere, and is missing two of the nuclei
known to form from that embryological structure. The
rhombomeres exist so briefly (Streeter, G. L., Contr.
Embryol. Carneg. Instn., 30, 213 (1948)) that the evidence
that one failed to form is conclusive in pinpointing the time
of injury. Like the thalidomide cases, the autopsy case could
have been injured only at the time of neural tube closure.

The effect of injury around neural tube closure has been
tested experimentally, to see whether it can produce
anatomical results like those suspected in the thalidomide cases
and observed in human brain. Animals exposed during the
critical period to valproic acid, a teratogen with effects
similar to thalidomide, which has also been associated with
autism (Christiansen, A. L. et al., Devel. Med. Child.
Neurol., 36, 357 (1994); Williams, P. G. et al., Dev. Med.
Child. Neurol., 39, 632 (1997)) exhibit reductions in the
number of cranial nerve motor neurons (Rodier, P. M. et al.,
J. Comp. Neurol., 370, 247 (1996)). They are distinguished
from controls by shortening of the hindbrain in the region
which forms from the fifth rhombomere, just as the
autopsied brain was (Rodier, P. M. et al., Teratology, 55, 319
(1997)). Additional data suggests that the animal model has
secondary changes in the cerebellum like those reported in
human cases of autism (Ingram, J. L. et al., Teratology,
53, 86 (1996)).

It has long been known that heritable factors play an
important role in the etiology of autism. This was
demonstrated by the original twin studies of Folstein and Rutter (J.
Child Psychol. Psychiat., 18, 297 (1977)) and the
subsequent addition of more twin pairs to the sample has only
increased the estimate of the proportion of cases suspected
to have a genetic basis (e.g. Bailey, A. et al., Psychol. Med.,
25, 63 (1995); LeCouteur, A. et al., J. Child Psychol.
Psychiat., 37, 785 (1996)). Family studies of siblings
(Smalley, S. L. et al., Arch. Gen. Psychiat., 45, 953 (1988))
and parents (Landa, R. et al., J. Speech Hear. Res., 34, 1339
(1991); Landa, R. et al., Psych. Med., 22, 245 (1992)) also
support the conclusion that an inherited risk is involved in
many, perhaps all, cases of autism spectrum disorders. While
the rate of autism is elevated in close relatives of cases, the
rate of symptoms short of the diagnosis is increased much
more. That is, individuals known to share genetic factors
seem to vary in the degree to which symptoms are
expressed. This non-Mendelian pattern (Jorde, L. B. et al.,
Am. J. Hum. Genet., 49, 932 (1991)) suggests a complex
disorder with major contributions from predisposing genetic
factors, which interact with the overall genetic background
and/or environmental insults to determine the phenotype.

The ability to identify the genetic factors that increase the
risk for autism would be a breakthrough for genetic
counseling for prevention of the disorder. In addition, it would
allow the creation of genetically-engineered animals in
which to study the environmental factors that interact with
the inherited predispositions. Tests for genetic factors would
also serve as biomarkers, valuable for diagnosis, and useful
in research on all aspects of the autism spectrum.
Unfortunately, neither linkage nor association studies have
revealed any chromosomal regions strongly related to
autism (e.g. Spence, M. A. et al., Behav. Genet., 15, 1
(1985); Smalley, S. L. et al., Arch. Gen. Psychiat., 45, 953
(1988); Cook, E. H. et al., Molec. Psychiat., 2, 247 (1997);
Klauck, S. M. et al., Hum. Molec. Genet., 6, 2233 (1997);
Cook, E. H. et al., Am. J. Hum. Genet., 62, 1077 (1998)).

Furthermore, while there is no known medical treatment
for autism, some success has been reported for early
intervention with behavioral therapies. A biomarker would allow
identification of the disease, now typically diagnosed
between ages three and five, in infancy or prenatal life. Thus,
there is an urgent need for a method of reliably identifying
subjects with autism. In particular there is need for a blood
test for polymorphisms causing autism spectrum disorders.
Families with affected members need to know whether they
carry a mutation which could affect future pregnancies.
Clinicians need a test as an aid in diagnosis, and researchers
would use the test to classify subjects according to the
etiology of their disease.

SUMMARY OF THE INVENTION

The present invention relates to a method for screening
subjects for genetic markers associated with autism. A
biological sample is isolated from a mammal and then tested
for the presence of a mutated gene or a product thereof
which is associated with autism.

Another aspect of the invention is an isolated nucleic acid
encoding a HoxA1 allele having a polymorphism which is
associated with autism spectrum disorders.

Yet another aspect of the invention is an isolated nucleic
acid encoding a HoxB1 allele having a polymorphism which
is associated with autism spectrum disorders.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows two different alleles of HoxA1 from a case
of autism spectrum disorder. FIG. 1A shows the previously
published sequence of wild-type HoxA1. FIG. 1B shows a
previously unknown polymorphism having a single base
substitution at position 218, where an A is changed to a G.

FIG. 2 shows a second polymorphism, identified in
the first exon of HoxB1. The published sequence of
wild-type HoxB1 (FIG. 2A) is compared to the previously
unknown polymorphism in this paralog of HoxA1 (FIG.
2B). In this case, the anomaly is a nine-base insertion that
adds a third repeat where two are normally present. The
result is three extra amino acids (serine-alanine-histidine).

For each of the polymorphisms, it was possible to test for the
presence of the allele different from the known sequence by
digesting PCR product with a restriction enzyme (Hph-I for
HoxA1 and Msp-I for HoxB1). Sequencing reactions were
carried out on 30-40 subjects to be certain that the digestion
results match the sequencing results, demonstrating that the
digestion procedure detects the deviant sequence described
and no other.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a method for screening
subjects for genetic markers associated with autism. A
biological sample is isolated from a mammal and then tested
for the presence of a mutated gene or a product thereof
which is associated with autism.

Polymorphisms in Hox genes are shown to be associated
with autism spectrum disorders. The Hox genes are a family
of genes that function in the patterning of body structures
that develop along an anteroposterior axis, such as the limbs,
skeleton, and nervous system; they are expressed during
embryonic development at specific times in limited regions
of the embryo. In the mouse, for example, Hox-a1 is
expressed in rhombomeres 4 through 8 of the developing
hindbrain on days 8 to 8.5 of gestation. The Hox genes
[...] pattern formation of the hindbrain [...] Similar
abnormalities have been observed in the brains of autistic
individuals (Rodier et al., J. Comp. Neurol., 370, 247 (1996),
which is hereby incorporated by reference).

The DNA and amino acid sequences for HoxA-1 have
previously been reported (Acampora, D. et al., Nucleic Acids
Res., 17, 10385 (1989); Hong, Y. et al., Gene, 159, 209
(1995), which are hereby incorporated by reference). Exon 1
stretches from base 1 to base 357. Exon 2 stretches from
base 358 to the end (1008). The wildtype gene sequence for
HoxA1 is provided in SEQ. ID. No. 1 as follows:

ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180

ATCGGTTCGC CCCACCACCA CCACCACCAC CAGCATCACC ACCCCCAGCC GGCTACCTAC 240

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008

The nucleic acid molecule of SEQ. ID. No. 1 encodes a
polypeptide having the amino acid sequence of SEQ. ID.
No. 2, as follows:






M D N A R M N S F L E Y P I L 15

S S G D S G T C S A R A Y P S 30

D H R I T T F Q S C A V S A N 45

S C G G D D R F L V G R G V Q 60

I G S P H H H H H H H H H H P 75

Q P A T Y Q T S G N L G V S Y 90

S H S S C G P S Y G S Q N F S 105

A P Y S P Y A L N Q E A D V S 120

G G Y P Q C A P A V Y S G N L 135

S S P M V Q H H H H H Q G Y A 150

G G A V G S P Q Y I H H S Y G 165

Q E H Q S L A L A T Y N N S L 180

S P L H A S H Q E A C R S P A 195

S E T S S P A Q T F D W M K V 210

K R N P P K T G K V G E Y G Y 225

L G Q P N A V R T N F T T K Q 240

L T E L E K E F H F N K Y L T 255

R A R R V E I A A S L Q L N E 270

T Q V K I W F Q N R R M K Q K 285

K R E K E G L L P I S P A T P 300

P G N D E K A E E S S E K S S 315

S S P C V P S P G S S T S D T 330

L T T S H 335



A polymorphism in the HoxA1 gene has been isolated and
sequenced. This polymorphism is associated with autism
spectrum disorders. A single base substitution is located at
position 218 (underlined) of SEQ. ID. No. 3, where an A is
changed to a G, as follows:

ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180

ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCGCC ACCCCCAGCC GGCTACCTAC 240

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008

The single base substitution at position 218 results in the
replacement of histidine with arginine (underlined). The
resulting protein has the amino acid sequence (SEQ. ID. No.
4) as follows:
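
The effect of the A-to-G substitution at position 218 can be traced with a few lines of Python. This is a hedged illustration, not part of the original disclosure, and it is not the SEQ. ID. No. 4 listing itself. It uses only the row covering bases 181 to 240 as printed for the wild-type and variant HoxA1 sequences. The function names are hypothetical, the codon table is deliberately partial, and the Hph-I recognition sequence (GGTGA, read as TCACC on the opposite strand) is an assumption brought in from general knowledge of the enzyme; the text states only that Hph-I digestion distinguishes the alleles. The two rows as printed also differ at position 213, which the text does not mention; the sketch does not depend on that base.

```python
# Row covering bases 181-240, copied as printed for the wild-type sequence
# (SEQ. ID. No. 1) and for the variant sequence (SEQ. ID. No. 3).
ROW_START = 181
WILD_TYPE_ROW = "ATCGGTTCGC CCCACCACCA CCACCACCAC CAGCATCACC ACCCCCAGCC GGCTACCTAC"
VARIANT_ROW = "ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCGCC ACCCCCAGCC GGCTACCTAC"

# Deliberately partial codon table: only the two codons needed here.
CODON_TABLE = {"CAC": "His", "CGC": "Arg"}

# Assumption: Hph-I recognises GGTGA; TCACC is the same site on the other strand.
HPH_I_SITE = ("GGTGA", "TCACC")

EXON_1_END = 357  # per the description: exon 1 spans bases 1 to 357


def strip_groups(row: str) -> str:
    """Remove the spaces between the ten-base groups."""
    return row.replace(" ", "")


def codon_number(position: int) -> int:
    """1-based codon index containing a 1-based base position."""
    return (position - 1) // 3 + 1


def codon_covering(row: str, position: int) -> str:
    """Return the codon that contains `position`, read from one listing row."""
    seq = strip_groups(row)
    first_base = (codon_number(position) - 1) * 3 + 1
    offset = first_base - ROW_START
    return seq[offset:offset + 3]


def sites_covering(row: str, position: int, motifs=HPH_I_SITE):
    """1-based start positions of motif matches that overlap `position`."""
    seq = strip_groups(row)
    target = position - ROW_START
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            if start <= target < start + len(motif):
                hits.append(start + ROW_START)
            start = seq.find(motif, start + 1)
    return hits


for label, row in (("wild type", WILD_TYPE_ROW), ("variant", VARIANT_ROW)):
    codon = codon_covering(row, 218)
    print(label, "codon", codon_number(218), codon, CODON_TABLE[codon],
          "Hph-I site over base 218:", bool(sites_covering(row, 218)))
print("position 218 lies in exon 1:", 218 <= EXON_1_END)
```

Under these assumptions the substitution falls on the middle base of codon 73, which is consistent with the histidine-to-arginine change described above, and it removes the only assumed Hph-I site overlapping base 218 in this row.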
In addition to the polymorphism in HoxA1, a polymorphism
associated with autism spectrum disorders has been
isolated and sequenced from the HoxB1 gene. The Hoxb1
gene has not been studied as comprehensively as Hoxa1 in
transgenic knockouts, but is expressed at the same stage
(Murphy, P. et al., Development, 111, 61 (1991), which is
hereby incorporated by reference). Its null mutation
produces similar malformations, including severe diminution of
the facial nucleus (Goddard, J. M. et al., Development, 122,
3217 (1996), which is hereby incorporated by reference).
The similarity of expression and function of these two genes
is due to the fact that they were originally a single gene in
invertebrates (Ruddle, F. H. et al., Annu. Rev. Genet., 28, 423
(1993), which is hereby incorporated by reference). In
mammals, the two appear on separate chromosomes (human
7 and 17), but the sequence of each of the mammalian genes
is similar to the other, and similar to the original single gene
from which the two mammalian loci arose. The sequence of
the wildtype HoxB1 gene (SEQ. ID. No. 5) follows:


TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60

CCCAGCGCCT ACAGCGCCCA CAGCGCCCCA ACCTCCTTTC CCCCAAGCTC GGCTCAGGCG 120

GTTGACAGCT ATGCAAGCGA GGGCCGCTAC GGTGGGGGGC TGTCCAGCCC TGCGTTTCAG 180

CAGAACTCCG GCTATCCCGC CCAGCAGCCG CCTTCGACCC TGGGGGTGCC CTTCCCCAGC 240

TCCGCGCCCT CGGGGTATGC TCCTGCCGCC TGCAGCCCCA GCTACGGGCC TTCTCAGTAC 300

TACCCTCTGG GTCAATCAGA AGGAGACGGA GGCTATTTTC ATCCCTCGAG CTACGGGGCC 360

CAGCTAGGGG GCTTGTCCGA TGGCTACGGA GCAGGTGGAG CCGGTCCGGG GCCATATCCT 420

CCGCAGCATC CCCCTTATGG GAACGAGCAG ACCGCGAGCT TTGCACCGGC CTATGCTGAT 480

CTCCTCTCCG AGGACAAGGA AACACCCTGC CCTTCAGAAC CTAACACCCC CACGGCCCGG 540

ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA 600

GGCCTGGGCT CGCCCAGTGG CCTCCGCACC AACTTCACCA CAAGGCAGCT GACAGAACTG 660

GAAAAGGAGT TCCATTTCAA CAAGTACCTG AGCCGGGCCC GGAGGGTGGA GATTGCCGCC 720

ACCCTGGAGC TCAATGAAAC ACAGGTCAAG ATTTGGTTCC AGAACCGACG AATGAAGCAG 780

AAGAAGCGCG AGCGAGAGGG AGGTCGGGTC CCCCCAGCCC CACCAGGCTG CCCCAAGGAG 840

GCAGCTGGAG ATGCCTCAGA CCAGTCGACA TGCACCTCCC CGGAAGCCTC ACCCAGCTCT 900

GTCACCTCCT GAACTGAACC TAGCCACCAA TGGGGCTTCC AGGCACTGGA GCGCCCCAGT 960

CCAGCCCTAT CCCAGGCTCT CCCAACCCAG GCCTGGCTTC ACTGCCTGGG ATCTCTAGGC 1020

T 1021

The protein encoded by nucleotides 7 to 909 of the
wild-type HoxB1 gene (SEQ. ID. No. 6) is as follows:

As with the HoxA1 gene, polymorphisms associated with
autism spectrum disorders were found with HoxB1. The
HoxB1 mutation occurs after base 88 (C) with the insertion
of nine nucleotides (ACAGCGCCC). The location of this
insertion is such that the amino acid sequence also changes.
The normal sequence reads . . . serine-alanine-histidine-
serine-alanine-proline. The mutant sequence has an extra
serine-alanine-histidine sequence and then the sequence
resumes normally. The insertion and altered amino acid
sequence are underlined below. A mutated form of HoxB1
(SEQ. ID. No. 7) is depicted as follows:



TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60

CCCAGCGCCT ACAGCGCCCA CAGCGCCCAC AGCGCCCCAA CCTCCTTTCC CCCAAGCTCG 120

GCTCAGGCGG TTGACAGCTA TGCAAGCGAG GGCCGCTACG GTGGGGGGCT GTCCAGCCCT 180

GCGTTTCAGC AGAACTCCGG CTATCCCGCC CAGCAGCCGC CTTCGACCCT GGGGGTGCCC 240

TTCCCCAGCT CCGCGCCCTC GGGGTATGCT CCTGCCGCCT GCAGCCCCAG CTACGGGCCT 300

TCTCAGTACT ACCCTCTGGG TCAATCAGAA GGAGACGGAG GCTATTTTCA TCCCTCGAGC 360

TACGGGGCCC AGCTAGGGGG CTTGTCCGAT GGCTACGGAG CAGGTGGAGC CGGTCCGGGG 420

CCATATCCTC CGCAGCATCC CCCTTATGGG AACGAGCAGA CCGCGAGCTT TGCACCGGCC 480

TATGCTGATC TCCTCTCCGA GGACAAGGAA ACACCCTGCC CTTCAGAACC TAACACCCCC 540

ACGGCCCGGA CCTTCGACTG GATGAAGGTT AAGAGAAACC CACCCAAGAC AGCGAAGGTG 600

TCAGAGCCAG GCCTGGGCTC GCCCAGTGGC CTCCGCACCA ACTTCACCAC AAGGCAGCTG 660

ACAGAACTGG AAAAGGAGTT CCATTTCAAC AAGTACCTGA GCCGGGCCCG GAGGGTGGAG 720

ATTGCCGCCA CCCTGGAGCT CAATGAAACA CAGGTCAAGA TTTGGTTCCA GAACCGACGA 780

ATGAAGCAGA AGAAGCGCGA GCGAGAGGGA GGTCGGGTCC CCCCAGCCCC ACCAGGCTGC 840

CCCAAGGAGG CAGCTGGAGA TGCCTCAGAC CAGTCGACAT GCACCTCCCC GGAAGCCTCA 900

CCCAGCTCTG TCACCTCCTG AACTGAACCT AGCCACCAAT GGGGCTTCCA GGCACTGGAG 960

CGCCCCAGTC CAGCCCTATC CCAGGCTCTC CCAACCCAGG CCTGGCTTCA CTGCCTGGGA 1020

TCTCTAGGCT

The protein encoded by SEQ. ID. No. 8 is as follows:

Genes which have been duplicated and then maintained
similar functions over the course of evolution are called
"paralogs." A third paralog derived from the same
invertebrate gene is known as HoxD1. This gene has not yet
been studied in knockouts, but is known to have evolved to be
expressed in somewhat different embryonic tissues
(mesoderm vs. ectoderm) in the hindbrain region at the same
stage of development as Hoxa1 and Hoxb1. Thus preferred
hox genes include HoxA1, HoxB1, and HoxD1.
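
The nine-nucleotide HoxB1 insertion described above (ACAGCGCCC after base 88) can be checked with a short script. The following is a minimal Python sketch and is not part of the original disclosure. It takes the first 120 bases of the wild-type listing (SEQ. ID. No. 5), applies the insertion, and translates both forms from the ATG at nucleotide 7, assuming the standard genetic code. The helper names `insert_after` and `translate` are illustrative only.

```python
# Illustrative sketch. Assumptions: standard genetic code; reading frame
# begins at the ATG at nucleotide 7 of the listing.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 120 bases of the wild-type HoxB1 listing (SEQ. ID. No. 5).
WILD_TYPE = (
    "TGACGCATGGACTATAATAGGATGAACTCCTTCTTAGAGTACCCACTCTGTAACCGGGGA"
    "CCCAGCGCCTACAGCGCCCACAGCGCCCCAACCTCCTTTCCCCCAAGCTCGGCTCAGGCG"
)
INSERTION = "ACAGCGCCC"


def insert_after(seq, position, insertion):
    """Insert `insertion` after the 1-based `position` of `seq`."""
    return seq[:position] + insertion + seq[position:]


def translate(seq, start):
    """Translate from the 1-based nucleotide `start`; a trailing partial codon is ignored."""
    coding = seq[start - 1:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


MUTANT = insert_after(WILD_TYPE, 88, INSERTION)
wild_protein = translate(WILD_TYPE, 7)
mutant_protein = translate(MUTANT, 7)
print(wild_protein)
print(mutant_protein)
```

Under these assumptions the wild-type fragment contains the run Ser-Ala-His-Ser-Ala-Pro, and the insertion adds one further His-Ser-Ala unit in frame, so the downstream residues are unchanged.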

Biological samples suitable for testing include blood, 
saliva, amniotic fluid, and tissue. The most preferred bio- 
logical sample is blood. However, any biological sample 
from which genetic material or the products of the marker 
genes can be isolated is suitable. 

Because the Hox genes are highly conserved among 
species, the present invention is applicable for screening for 
autism related polymorphisms in mammals. The screening 
method can be utilized to identify animals carrying defects 
in genes like those which give rise to autism in humans in 
order to study the progression of the disease and test 
treatments. However, the preferred mammal to be screened 
is humans. In particular, the biological samples are isolated 
from developmentally disabled children or adults in order to 
determine whether they carry the marker associated with 
autism to assist in diagnosing the disease. Similarly, the 
parents or relatives of disabled children may be screened to 
determine whether they are carriers of the mutated gene. 
Samples may also be tested from children including infants 
to identify those children who have genetic markers asso- 
ciated with autism in order to provide them with early 
behavior training. 

As discussed more fully in the examples, polymorphisms
in the HoxA1 gene are associated with autism spectrum
disorders. In addition to HoxA1, the HoxB1 and HoxD1
genes are also involved in the same stages of early brain
development. Hoxb1 and Hoxd1 are related developmental
genes which are expressed at the same time and in
approximately the same region of the embryo as Hoxa1. The Hox
genes are closely related and may perform similar functions
in development. Evolutionarily the various Hox genes were
probably derived from a common ancestral gene. Thus, the
preferred genes to be screened include Hoxa1, Hoxb1, and
Hoxd1.



The mutation in the mutated gene may be a single base 
substitution mutation resulting in an amino acid substitution, 
a single base substitution mutation resulting in a transla- 
tional stop, an insertion mutation, a deletion mutation, or a 
gene rearrangement. As demonstrated from the identified 
polymorphisms in HoxA1 and HoxB1, polymorphisms 
which disrupt the gene or result in an altered peptide are 
associated with autism spectrum disorders. 

The mutation may be located in an intron, an exon of the 
gene, or a promoter or other regulatory region which affects 
the expression of the gene. 

Methods for screening for mutated nucleic acids include
direct sequencing of nucleic acids, single strand
polymorphism assay, ligase chain reaction, enzymatic cleavage,
and Southern hybridization.

Screening for mutated nucleic acids can be accomplished
by direct sequencing of nucleic acids. In fact, putative
mutants identified by other methods may be sequenced to
determine the exact nature of the mutation. Nucleic acid
sequences can be determined through a number of different
techniques which are well known to those skilled in the art.
In order to sequence the nucleic acid, sufficient copies of the
material must first be amplified.

Amplification of a selected, or target, nucleic acid
sequence may be carried out by any suitable means. (See
generally Kwoh, D. and Kwoh, T., Am Biotechnol Lab, 8, 14
(1990), which is hereby incorporated by reference.)
Examples of suitable amplification techniques include, but
are not limited to, polymerase chain reaction, ligase chain
reaction (see Barany, Proc Natl Acad Sci USA 88, 189
(1991), which is hereby incorporated by reference), strand
displacement amplification (see generally Walker, G. et al.,
Nucleic Acids Res. 20, 1691 (1992); Walker, G. et al., Proc
Natl Acad Sci USA 89, 392 (1992), which are hereby
incorporated by reference), transcription-based amplification
(see Kwoh, D. et al., Proc Natl Acad Sci USA, 86, 1173
(1989), which is hereby incorporated by reference),
self-sustained sequence replication (or "3SR") (see Guatelli, J.
et al., Proc Natl Acad Sci USA, 87, 1874 (1990), which is
hereby incorporated by reference), the Qβ replicase system
(see Lizardi, P. et al., Biotechnology, 6, 1197 (1988), which
is hereby incorporated by reference), nucleic acid
sequence-based amplification (or "NASBA") (see Lewis, R.,
Genetic Engineering News, 12(9), 1 (1992), which is hereby
incorporated by reference), the repair chain reaction (or
"RCR") (see Lewis, R., Genetic Engineering News, 12(9), 1
(1992), which is hereby incorporated by reference), and
boomerang DNA amplification (or "BDA") (see Lewis, R.,
Genetic Engineering News, 12(9), 1 (1992), which is hereby
incorporated by reference). Polymerase chain reaction is
currently preferred.

In general, DNA amplification techniques such as the
foregoing involve the use of a probe, a pair of probes or two
pairs of probes which specifically bind to DNA encoding the
gene of interest, but do not bind to DNA which does not
encode the gene, under the same hybridization conditions,
and which serve as the primer or primers for the
amplification of the gene of interest or a portion thereof in the
amplification reaction.

Nucleic acid sequencing can be performed by chemical or
enzymatic methods. The enzymatic method relies on the
ability of DNA polymerase to extend a primer, hybridized to
the template to be sequenced, until a chain-terminating
nucleotide is incorporated. The most common methods
utilize dideoxynucleotides. Primers may be labelled with
radioactive or fluorescent labels. Various DNA polymerases
are available including Klenow fragment, AMV reverse
transcriptase, Thermus aquaticus DNA polymerase, and
modified T7 polymerase.

Although DNA sequencing is clearly the most sensitive
and informative method, it is too cumbersome for routine
use in searching for polymorphisms, especially when the
DNA segment of interest is large. Several other methods are
available for a rapid search for changes in autism associated
genes.

Recently, single strand polymorphism assay ("SSPA")
analysis and the closely related heteroduplex analysis
methods have come into use as effective methods for screening
for single-base polymorphisms (Orita, M. et al., Proc Natl
Acad Sci USA, 86, 2766 (1989), which is hereby
incorporated by reference). In these methods, the mobility of
PCR-amplified test DNA from clinical specimens is
compared with the mobility of DNA amplified from normal
sources by direct electrophoresis of samples in adjacent
lanes of native polyacrylamide or other types of matrix gels.
Single-base changes often alter the secondary structure of
the molecule sufficiently to cause slight mobility differences
between the normal and mutant PCR products after
prolonged electrophoresis.

Ligase chain reaction is yet another recently developed
method of screening for mutated nucleic acids. Ligase chain
reaction (LCR) is also carried out in accordance with known
techniques. LCR is especially useful to amplify, and thereby
detect, single nucleotide differences between two DNA
samples. In general, the reaction is carried out with two pairs
of oligonucleotide probes: one pair binds to one strand of the
sequence to be detected; the other pair binds to the other
strand of the sequence to be detected. The reaction is carried
out by, first, denaturing (e.g., separating) the strands of the
sequence to be detected, then reacting the strands with the
two pairs of oligonucleotide probes in the presence of a heat
stable ligase so that each pair of oligonucleotide probes
hybridizes to target DNA and, if there is perfect
complementarity at their junction, adjacent probes are ligated
together. The hybridized molecules are then separated under
denaturation conditions. The process is cyclically repeated
until the sequence has been amplified to the desired degree.
Detection may then be carried out in a manner like that
described above with respect to PCR.

Southern hybridization is also an effective method of
identifying differences in sequences. Hybridization
conditions, such as salt concentration and temperature, can
be adjusted for the sequence to be screened. Southern
blotting and hybridization protocols are described in
Current Protocols in Molecular Biology (Greene Publishing
Associates and Wiley-Interscience), pages 2.9.1-2.9.10.
Probes can be labelled for hybridization with random
oligomers (primarily 9-mers) and the Klenow fragment of
DNA polymerase. Very high specific activity probe can be
obtained using commercially available kits such as the
Ready-To-Go DNA Labelling Beads (Pharmacia Biotech),
following the manufacturer's protocol. Briefly, 25 ng of
DNA (probe) is labelled with 32P-dCTP in a 15 minute
incubation at 37° C. Labelled probe is then purified over a
ChromaSpin (Clontech) nucleic acid purification column.
Possible competition of probes having high repeat sequence
content, and stringency of hybridization and washdown will
be determined individually for each probe used.
Alternatively, fragments of a candidate gene may be
generated by PCR, the specificity may be verified using a
rodent-human somatic cell hybrid panel, and the fragment
may be subcloned. This allows for a large prep for
sequencing and use as a probe. Once a given gene fragment
has been characterized, small probe preps can be done by
gel- or column-purifying the PCR product.

These mismatch detection protocols use samples
generated by PCR and thus require use of very little genomic
template. All of these methods can provide very good clues
regarding the location of the sequence change which leads to
the appearance of anomalous bands, hence facilitating
subsequent cloning and sequencing strategies.

Methods of screening for mutated nucleic acids can be
carried out using either deoxyribonucleic acids ("DNA") or
messenger ribonucleic acids ("mRNA") isolated from the
biological sample. During periods when the gene is
expressed, mRNA may be abundant and more readily
detected. However, these genes are temporally controlled
and, at most stages of development, the preferred material
for screening is DNA.

Alternatively, the detection of a mutated gene associated
with autism can be carried out by collecting a biological
sample and testing for the presence or form of the protein
produced by the gene. The mutation in the gene may result
in the production of a mutated form of the peptide or the lack
of production of the gene product. In this embodiment, the
determination of the presence of the polymorphic form of
the protein can be carried out, for example, by isoelectric
focusing, protein sizing, or immunoassay. In an
immunoassay, an antibody that selectively binds to the
mutated protein can be utilized (for example, an antibody
that selectively binds to the mutated form of HoxA1
encoded protein). Such methods for isoelectric focusing and
immunoassay are well known in the art, and are discussed in
further detail below.

Changes in the size or charge of the polypeptide can be
identified by isoelectric focusing or protein sizing
techniques. Changes resulting in amino acid substitutions, where
the substituted amino acid has a different charge than the
original amino acid, can be detected by isoelectric focusing.
Isoelectric focusing of the polypeptide through a gel having
an ampholine gradient at high voltages separates proteins by
their pI. The pH gradient gel can be compared to a
simultaneously run gel containing the wild-type protein. Protein
sizing techniques such as protein electrophoresis and sizing
chromatography can also be used to detect changes in the
size of the product.

As an alternative to isoelectric focusing or protein sizing,
the step of determining the presence of the mutated
polypeptides in a sample may be carried out by an antibody assay
with an antibody which selectively binds to the mutated
polypeptides (i.e., an antibody which binds to the mutated
polypeptides but exhibits essentially no binding to the
wild-type polypeptide without the polymorphism in the
same binding conditions).

Antibodies used to bind selectively the products of the
mutated genes can be produced by any suitable technique.
For example, monoclonal antibodies may be produced in a
hybridoma cell line according to the techniques of Kohler
and Milstein, Nature, 256, 495 (1975), which is hereby
incorporated by reference. A hybridoma is an immortalized
cell line which is capable of secreting a specific monoclonal
antibody. The mutated products of genes which are
associated with autism may be obtained from a human patient,
purified, and used as the immunogen for the production of
monoclonal or polyclonal antibodies. Purified polypeptides
may be produced by recombinant means to express a
biologically active isoform, or even an immunogenic fragment
thereof may be used as an immunogen. Monoclonal Fab
fragments may be produced in Escherichia coli from the
known sequences by recombinant techniques known to
those skilled in the art. (See, e.g., Huse, W., Science 246,
1275 (1989), which is hereby incorporated by reference)
(recombinant Fab techniques).

The term "antibodies" as used herein refers to all types of
immunoglobulin, including IgG, IgM, IgA, IgD, and IgE.
The antibodies may be monoclonal or polyclonal and may
be of any species of origin, including (for example) mouse,
rat, rabbit, horse, or human, or may be chimeric antibodies,
and include antibody fragments such as, for example, Fab,
F(ab')2, and Fv fragments, and the corresponding fragments
obtained from antibodies other than IgG.

Antibody assays may, in general, be homogeneous assays
or heterogeneous. In a homogeneous assay the
immunological reaction usually involves the specific antibody, a labeled
analyte, and the sample of interest. The signal arising from
the label is modified, directly or indirectly, upon the binding
of the antibody to the labeled analyte. Both the
immunological reaction and detection of the extent thereof are
carried out in a homogeneous solution. Immunochemical
labels which may be employed include free radicals,
radioisotopes, fluorescent dyes, enzymes, bacteriophages,
coenzymes, and so forth.

In a heterogeneous assay approach, the reagents are
usually the specimen, the antibody of the invention and
means for producing a detectable signal. Similar specimens
as described above may be used. The antibody is generally
immobilized on a support, such as a bead, plate, or slide, and
contacted with the specimen suspected of containing the
antigen in a liquid phase. The support is then separated from
the liquid phase and either the support phase or the liquid
phase is examined for a detectable signal employing means
for producing such signal. The signal is related to the
presence of the analyte in the specimen. Means for
producing a detectable signal include the use of radioactive labels,
fluorescent labels, enzyme labels, and so forth. For example,
if the antigen to be detected contains a second binding site,
an antibody which binds to that site can be conjugated to a
detectable group and added to the liquid phase reaction
solution before the separation step. The presence of the
detectable group on the solid support indicates the presence
of the antigen in the test sample. Examples of suitable
immunoassays are the radioimmunoassay,
immunofluorescence methods, enzyme-linked immunoassays, and the like.

Those skilled in the art will be familiar with numerous
specific immunoassay formats and variations thereof which
may be useful for carrying out the method disclosed herein.
See U.S. Pat. Nos. 4,727,022, 4,659,678, 4,376,110,
4,275,149, 4,233,402, and 4,230,767.

Antibodies which selectively bind a polymorphic DLST
isoform may be conjugated to a solid support suitable for a
diagnostic assay (e.g., beads, plates, slides or wells formed
from materials such as latex or polystyrene) in accordance
with known techniques, such as precipitation. Antibodies
which bind a polymorphic DLST isoform may likewise be
conjugated to detectable groups such as radiolabels (e.g.,
35S, 125I, 131I), enzyme labels (e.g., horseradish peroxidase,
alkaline phosphatase), and fluorescent labels (e.g.,
fluorescein) in accordance with known techniques.

The invention further provides an isolated nucleic acid
molecule which encodes a HoxA1 gene having a single base
substitution at nucleotide 218 in SEQ. ID. No. 1. In another
embodiment, the invention provides an isolated nucleic acid
molecule which encodes a HoxB1 gene having an insertion
between nucleotides 88 and 89 in SEQ. ID. No. 5.
In addition, the invention provides fragments of the HoxA1
and HoxB1 genes having the polymorphism, where the
fragment has at least 15 nucleotides and encompasses the
polymorphism, i.e., the single base substitution. Fragments
of [...] nucleotides can be used to probe for nucleic
acid molecules containing the polymorphism. Longer
fragments can be used at higher stringency conditions.

The invention also provides isolated polypeptides that are
encoded by the genes having the polymorphisms. Either the
whole protein or fragments thereof may be used to induce
the production of antibodies specific to the portion of the
protein which is affected by the polymorphism. Such
antibodies may then be used to detect the presence of a
polymorphism. Preferred antibodies bind specifically to the
polypeptide affected by the polymorphism but
not to wild-type Hox protein.

In one embodiment, the antibody is a monoclonal
antibody. In an immunoassay, the antibody can be bound
to a solid support or bound to a detectable label.

EXAMPLES

Example 1
Collection of Blood Samples from Autistic Individuals

Blood was collected from patients with autism and their
immediate family members in order to determine whether
any polymorphisms in HoxA1 are present among this
population. All blood samples were procured following written
consent by the patients or their guardians. Among the
samples collected were those of the members of a family of
four in which one child has autism and the other has
Asperger's syndrome; both children have malformed ears.
The first son is retarded and the second has normal
intelligence. The parents have no obvious symptoms. DNA was
extracted from the blood by phenol-chloroform extraction
following isolation and lysis of the white blood cells.
Control DNA was also used for these experiments; this DNA
was obtained from neurologically normal donors.

The 20 cc blood samples were left for three to four days at
room temperature to allow continued proliferation of white
blood cells. White cells were pelleted, followed by isolation
of the nuclei. The nuclei were then incubated overnight at
37° C. in a lysis buffer consisting of EDTA, TNE-SDS, and
proteinase K. Protein contaminants were extracted by
additions of buffered phenol followed by chloroform, then DNA
was precipitated by the addition of ice-cold ethanol. The
DNA was resuspended in TE buffer for storage at 4° C.
Extraction of genomic DNA from fixed tissue was carried
out using the protocol of Volkenandt et al., Methods in
Molecular Biology, 15, 81, Humana Press, (1993), which is
hereby incorporated by reference.



Example 2 
Sequencing the Hoxa1 Gene

The HoxA1 gene was amplified by PCR from DNA
samples to provide sufficient material for sequencing. Two
sets of oligonucleotide primers were selected after
examination of the human HoxA1 nucleic acid sequence and
comparison of the sequence to those of human and mouse
Hox genes. The first set was designed to amplify residues
10-647, the second to amplify from residue 656 to the stop
codon at residue 1008, exons 1 and 2 of HoxA1,
respectively. The primers were used in polymerase chain reaction
to amplify the target gene in several control blood samples,
in order to determine the appropriate PCR conditions. Both
exons were amplified by 94° C. denaturation for 1 min, 62°
C. annealing for 30 sec, and 72° C. extension for 2 min, for
35 cycles. The products were visualized with ethidium
bromide staining on a 1-2% agarose gel. PhiX174 RF
DNA/Hae III fragments (Gibco) were used as a molecular
weight marker. The products were tested for chromosome
origin by using human-rodent monochromosomal somatic
cell hybrids. Both HoxA1 primer sets amplified product from
the hybrid containing human chromosome 7 and did not
amplify from any other hybrid. Establishing that the
product amplified by the primers is from the correct
chromosome rules out the possibility that pseudogenes with the
same sequence occur at other sites or that the amplified
product is another homologous homeobox gene. It verifies
that the PCR product represents only the targeted gene.

The polymerase chain reaction (PCR) was performed with 
various samples of control DNA in order to determine the 
appropriate conditions. Once the optimal conditions were 
ascertained, the gene was amplified from the patient 
samples. 

Following PCR, an aliquot of the product was used for
DNA sequencing using the Sequenase system version 2.0
(United States Biochemical), which is a chain-termination
method of DNA sequencing. The following procedure was
used to read the nucleic acid sequence of the amplified
products. 7 μl of PCR product was mixed with 2 μl shrimp
alkaline phosphatase and 0.5 μl exonuclease I. The mixture
was incubated at 37° C. for 15 min and then at 80° C. for 15
min. After addition of 1 μl of primer, the mixture was
incubated at 100° C. for 3 min and then chilled on ice for 5
min. Next, the sample was incubated for 5 min at room
temperature with the following additions: 2 μl 5x buffer, 1
μl DTT, 2 μl diluted dGTP, 0.5 μl 35S-dATP, and 2 μl diluted
Sequenase buffer. A 3.5 μl aliquot of the mixture was then
added to 1 μl of one dideoxyNTP. After 5 min at 37° C., 4
μl of stop solution was added to the tube. The products were
run on a 6% polyacrylamide sequencing gel for 2-4 hr.
Following this, the gel was dried on a BioRad gel dryer and
exposed to film overnight. Film was developed on a Kodak
M35A X-OMAT Processor. The method has been used
successfully to duplicate the published sequence of the
Hoxa1 exons in samples from a number of controls. The film
was developed the next afternoon, and the DNA sequence
was read manually for comparison to the published HoxA1
sequence.

The nucleotide sequence from some patients, including
the members of the family mentioned previously, showed
the presence of two discrete bands at the same levels on the
gel.

Example 3
Sequencing the PCR Products

Since sequencing PCR products allows the DNA
sequence to be read from both alleles, a sequence with
double bands suggests heterozygosity: the two alleles
are not the same, and two different sequences superimposed
on one another are being read. Based on these results,
the PCR products were cloned in order to get a cleaner
sequence. Cloning separates the two alleles and allowed
each to be individually sequenced to determine whether one
or both alleles are abnormal.
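
The reasoning behind the double bands can be illustrated with a small comparison of two aligned allele sequences. The following Python sketch is illustrative only and makes two labeled assumptions: the 60-base window is nucleotides 181-240 of the HoxA1 coding sequence listed earlier in this document (which carries G at position 218), and the wild-type allele differs from it only by carrying A at position 218, the single-base A-to-G change reported for this family. The function names are hypothetical.

```python
# Illustrative sketch; sequences and coordinates are assumptions as noted.
WINDOW_START = 181  # 1-based coordinate of the first base in the window

# Nucleotides 181-240 of the HoxA1 coding sequence listing (G at 218).
VARIANT = "ATCGGTTCGCCCCACCACCACCACCACCACCACCATCGCCACCCCCAGCCGGCTACCTAC"


def with_base(seq, position, base, start=WINDOW_START):
    """Return `seq` with the base at 1-based `position` replaced by `base`."""
    idx = position - start
    return seq[:idx] + base + seq[idx + 1:]


# Assumption: the wild-type allele has A at nucleotide 218.
WILD_TYPE = with_base(VARIANT, 218, "A")


def heterozygous_positions(allele_1, allele_2, start=WINDOW_START):
    """List (position, base_1, base_2) wherever two aligned alleles differ."""
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles must be aligned and of equal length")
    return [
        (start + i, a, b)
        for i, (a, b) in enumerate(zip(allele_1, allele_2))
        if a != b
    ]


print(heterozygous_positions(WILD_TYPE, VARIANT))
```

Each reported position is one where a direct read of the mixed PCR product would show two bands at the same level, which is why cloning the alleles separately gives a cleaner read.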

The PCR products were cloned using Invitrogen's Zero 
Blunt PCR Cloning Kit. This kit is designed to clone 
blunt-ended PCR fragments, which can be generated by 
using a thermostable DNA polymerase with proofreading 
activity. Once the products were cloned, the clonal DNA was 
sequenced using the Sequenase version 2.0 chain- 
termination sequencing system. Each clone was sequenced 
in both 5' and 3' directions, and the reactions were run out 
for 6 hours on a 6% polyacrylamide sequencing gel. 

Cloning allowed the determination that three out of four
members of this family are indeed heterozygous for HoxA1.
The father and both children contain an identical mutation
in the gene: a single base-pair change of A to G in the first
exon of the gene; the mother's gene is normal. This mutation
is dominant with variable penetrance. Sequences showing
the mutation can be seen in FIG. 1. FIG. 1A shows the
wild-type sequence. Substitution of guanine for adenine at
this single location as shown in FIG. 1B causes an alteration
in the resulting amino acid sequence, changing a histidine to
an arginine.

Example 4
Restriction Analysis of PCR Products

The PCR products from this family were also subjected to
restriction enzyme digestion to confirm the mutation. The
enzyme Hph I recognizes the specific sequence 3' . . .
CCACT(N7) . . . 5'. When normal HoxA1 is digested with
this enzyme, it will be cut; however, when mutated HoxA1
is digested, it will not be cut, because the recognition site has
been changed by the mutation. This enzyme has been used
to digest PCR products from this family and confirm that the
mutation does indeed exist in the father and the children but
not in the mother. This enzyme has been used to digest PCR
products from approximately 100 controls, 36 parent pairs,
26 affected relatives, and 46 probands. In forty cases, the
results of the restriction analysis have been compared to those
from the sequencing reactions. The two methods gave
identical results in every case.
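
The loss of the Hph I site can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the example. The 60-base window is nucleotides 181-240 of the HoxA1 coding sequence listed earlier in this document, the wild-type form is assumed to carry A at nucleotide 218, and the recognition sequence is written on the top strand as GGTGA (the bottom strand then reads 3'-CCACT-5', as in the text). The function names are hypothetical.

```python
# Illustrative sketch; only the recognition site is searched for, not the
# downstream cut position.
HPH_I_SITE = "GGTGA"  # top strand; bottom strand reads 3'-CCACT-5'
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Nucleotides 181-240 of the HoxA1 coding sequence listing (G at 218).
VARIANT = "ATCGGTTCGCCCCACCACCACCACCACCACCACCATCGCCACCCCCAGCCGGCTACCTAC"
# Assumption: wild type carries A at nucleotide 218 (index 37 of the window).
WILD_TYPE = VARIANT[:37] + "A" + VARIANT[38:]


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_site(seq, site=HPH_I_SITE):
    """True if the recognition site occurs on either strand of `seq`."""
    return site in seq or reverse_complement(site) in seq


print("wild type has site:", has_site(WILD_TYPE))
print("variant has site:", has_site(VARIANT))
```

The check covers only this window. A full-length PCR product could contain additional Hph I sites elsewhere, in which case the two alleles would differ in fragment pattern rather than in cut versus uncut.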

Example 5 
Sequencing of a Polymorphism in HoxBl 

The sequence for the HoxBl gene (accession number 
X16666) was obtained from the Entrez data base. From this 
sequence primers for the amplification of a 575 bp product 
of exon 1 by PCR were designed (Sense: 
5'.GCATGGACTATAATAGGATG-3' (SEQ. ID. No. 9); 
Antisense: 5'-TCTrGGGTGGGTTTCTCTrA-3' (SEQ. ID. 
No. 10)). llie final concentration of the following compo- 
nents were used in the amplification reaction: 1.5 U Taq 
polymerase; 200 /^M each of dATP, dCTP, dGTP, dTTP; 1.5 
mM MgCl; 0.4 mM of each sense and antisense primer; 
50-100 ng DNA template; and distilled H^O to a final 
volume of 25 jtd. The Taq, dNTPs and MgCl are supplied in 
a Ready-To-Go PCR Bead (Pharmacia 27-9555-01) and 
were used according to manufacturer's directions. The PCR 
reaction was carried out in a Perkin-Ehner 480 Gene Amp or 
a Perkin-Elmer 2400 thermocycler. Reaction conditions 
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were: denaturing for 1 minute at 94* C, and then 35 cycles 
of denaturing at 94* C. for 45 sec, annealing at 5TC, for 45 
sec, and elongation at 72° C. for 45 sec. Resulting PGR 
product was analyzed on a 1% agarose gel and compared to 
a 100 bp ladder to determine the size of the product. Since 
the size of the product was as expected (575 bp) and somatic 
cell hybrid results indicated that the product is specific for 
chromosome 17, DNA samples from probands, family members 
and controls were amplified and sequenced using a 
radiolabeled terminator cycle sequencing kit (Amersham 
Life Science US79750). The sequencing reaction was run on 
a 6% acrylamide sequencing gel (National Diagnostics) and 
exposed to Kodak Biomax MS X-ray film for 24-48 hours. 
After developing the film, the resulting sequence was compared 
to the published sequence found in the Entrez database. 
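
The relationship between the antisense primer and the template strand can be checked with a short sketch. It takes the two primers from SEQ ID NO:9 and SEQ ID NO:10 and nucleotides 541-600 of SEQ ID NO:5 as listed below. The helper names are hypothetical, and the GC fraction is shown only as a simple descriptive property of each primer.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

SENSE = "GCATGGACTATAATAGGATG"      # SEQ ID NO:9
ANTISENSE = "TCTTGGGTGGGTTTCTCTTA"  # SEQ ID NO:10

# Nucleotides 541-600 of SEQ ID NO:5, as given in the Sequence Listing.
TEMPLATE_541_600 = (
    "ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA"
).replace(" ", "")

# The antisense primer anneals where its reverse complement appears on the listed strand.
BINDING_SITE = reverse_complement(ANTISENSE)
OFFSET = TEMPLATE_541_600.find(BINDING_SITE)  # 0-based offset within the window

print(BINDING_SITE, 541 + OFFSET, gc_fraction(SENSE), gc_fraction(ANTISENSE))
```

Under these assumptions the reverse complement of the antisense primer is found at nucleotides 561-580 of SEQ ID NO:5.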

Example 6 

Association of the Newly-discovered Alleles with Autism 
Spectrum Disorders 

Forty-six probands with autistic spectrum disorders and 
evidence of genetic causation were selected for analysis. 
Forty-three had one or more other affected family members 
and thirty-five had ear anomalies or neurological deficits 
consistent with malfunction of HoxA1 or its paralogs. For 
comparison, three other groups were tested: 

1) An unstructured control group consisting of adults with 
no evidence of neurological abnormality collected from 
many different medical centers. These were mostly spouses 
of patients with late onset degenerative diseases of the 
nervous system. The purpose of this group was to determine 
the frequency of the alleles in the general population. 

2) Parent controls: While each of the parents of a 
proband obviously transmits half of his or her genetic 
material to the proband, imaginary individuals with two 
alleles constructed from the untransmitted allele of each 
parent pair should give an accurate estimate of the frequency 
of the alleles in the study population, aside from those 
transmitted to the probands. Thus, the untransmitted alleles 
of the parent pair make a more stringent control, taking into 
account known and unknown structure in the local population. 

3) Affected family members of probands: When they 
were available, the siblings, cousins, parents, or aunts and 
uncles of probands diagnosed with autism spectrum disorders 
or related symptoms (e.g. learning disabilities, language 
delays, neurological anomalies of the cranial nerves) were 
tested. If an allele is associated with autism, it should be 
more frequent in probands and affected family members 
than in historic or parent controls. 
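
The construction of the parent controls in group 2 can be sketched as follows. This is a minimal illustration only: the allele labels "N" (published sequence) and "V" (variant) and the function names are hypothetical, and the allele each parent transmitted is assumed to be known already.

```python
def untransmitted_allele(parent_genotype, transmitted):
    """Return the allele of a parent that was NOT passed to the proband.

    parent_genotype is a pair of allele labels; transmitted must be one of them.
    """
    alleles = list(parent_genotype)
    alleles.remove(transmitted)  # raises ValueError if the allele is absent
    return alleles[0]

def parent_control(father, father_transmitted, mother, mother_transmitted):
    """Build the imaginary control genotype from the two untransmitted alleles."""
    return tuple(sorted((
        untransmitted_allele(father, father_transmitted),
        untransmitted_allele(mother, mother_transmitted),
    )))

# Hypothetical trio: the father is heterozygous and transmits the variant,
# the mother is homozygous for the published sequence.
EXAMPLE_CONTROL = parent_control(("N", "V"), "V", ("N", "N"), "N")
print(EXAMPLE_CONTROL)
```

In this hypothetical trio the proband carries the variant, while the control built from the untransmitted alleles does not.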

TABLE 1 

                                    HOXA1    HOXB1    HOXA1 or HOXB1 
Historic controls (N = 101)         16       34       47 
Parent controls (N = 36)            22       39       55† 
Probands with ASD (N = 46)          35**     52*      80*** 
Other affected relatives (N = 24)   38*      42       75* 

different from historical controls: * = p < .05, ** = p < .01, *** = p < .001 
different from probands: † = p < .05 

Table 1 demonstrates that parent controls are, indeed, 
similar to historic controls in their rates of the polymorphisms 
under study, while affected family members are 
similar to probands. This is especially true when the two 
functionally-related genes are combined. Eighty percent of 
probands have one deviation from the normal sequence or 
the other, while only 47% of historical controls have an 
anomaly. Parent controls (untransmitted alleles) match the 
historical controls in their rate of abnormal alleles, indicating 
that the local population is not structured differently 
from the general population in its rate of these alleles. In 
contrast, both probands (χ²=14.83, p<0.001) and other 
affected family members (χ²=6.30, p<0.02) differ significantly 
from historical controls. The probands differ significantly 
from the parent controls as well (χ²=4.08, p<0.05). 
The probands with genetic anomalies of HoxA1 or HoxB1 
are concordant with the other affected members of the family 
in 18/22 cases (χ²=17.82, p<0.001). Finally, both the HoxA1 
and HoxB1 polymorphisms are significantly associated with 
autism as judged by the Transmission Disequilibrium Test 
for Association (Spielman and Ewens, 1996), which compares 
the rate of transmission "into the disease" to the 50% 
rate one would expect in offspring of parents with the allele 
of interest. The χ² values for this test are: HoxA1=5.16, p<0.05; 
HoxB1=4.67, p<0.05. 
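
The comparisons with historical controls can be sketched as 2x2 chi-square tests. The carrier counts below are not stated in the text; they are an assumption, obtained by rounding the combined-gene percentages of Table 1 to whole individuals (80% of 46 as 37, 47% of 101 as 47, 75% of 24 as 18), and an uncorrected Pearson statistic is assumed. The transmission disequilibrium statistic is shown in its usual (b-c)²/(b+c) form with purely hypothetical transmission counts, since the actual counts are not given.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def tdt_chi2(transmitted, untransmitted):
    """TDT statistic from heterozygous parents: (b - c)^2 / (b + c)."""
    return (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)

# (carriers, non-carriers); counts inferred from Table 1 percentages (assumption).
PROBANDS = (37, 9)     # about 80% of 46
HISTORIC = (47, 54)    # about 47% of 101
RELATIVES = (18, 6)    # 75% of 24

CHI2_PROBANDS = pearson_chi2_2x2(*PROBANDS, *HISTORIC)    # approximately 14.83
CHI2_RELATIVES = pearson_chi2_2x2(*RELATIVES, *HISTORIC)  # approximately 6.30

# Hypothetical counts only: 20 transmissions of the allele versus 8 non-transmissions.
TDT_EXAMPLE = tdt_chi2(20, 8)

print(round(CHI2_PROBANDS, 2), round(CHI2_RELATIVES, 2), round(TDT_EXAMPLE, 2))
```

With the inferred counts, the first two statistics agree with the values reported above for probands and for other affected family members. The other reported values are not reproduced here, because the underlying counts cannot be recovered unambiguously from the table.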

In addition to the living probands, it was of interest to 
determine the genotype of the patient whose brain anatomy 
first suggested the involvement of the Hox genes in autism 
(Rodier et al., 1996). Genomic DNA was extracted from the 
autopsy tissue, and the patient was determined to have the 
B1 polymorphism (Stodgell et al., 1998). 

One proband is homozygous for the less common allele of 
HoxA1, and he is severely affected. He was diagnosed early, 
at 21 months. None of the historic controls, and no parents, 
were homozygous for the polymorphism. Homozygosity of 
the HoxB1 polymorphism occurred in two historic controls, 
one affected parent, and in two severely-affected probands. 
Larger samples are needed to determine whether either 
polymorphism reduces viability. Three probands have both 
polymorphisms, and are severely disabled. The detection 
and description of the polymorphisms in the first exons of 
HoxA1 and HoxB1 and the progress of the association 
studies have been described in a book chapter and two 
abstracts (Rodier, 1998; Ingram et al., 1997; Stodgell et al., 
1998). 

Example 7 

Identification of a Second Polymorphism in HoxA1 

A third polymorphism has been detected in the homeobox 
region of HoxA1 in the second exon. The second exon 
cannot be amplified by PCR from the DNA of four probands, 
indicating that an anomaly exists. This indicates that they are 
homozygous for a deviation from the published sequence on 
which the primers for the exon were based. PCR amplification 
yields suggest that about ten other probands are 
heterozygotes for this polymorphism of the second exon of 
HoxA1. 

Additional primers have been developed that will allow 
complete sequencing of the altered region, which appears to 
be at the 3' end of the homeobox. Once the sequence is 
established, a test (such as the use of restriction fragment length 
polymorphisms) can be developed to allow rapid evaluation 
of DNA samples. The degree of association of this poly- 
morphism with autism spectrum disorders will then be 
studied in the same groups already evaluated for the others. 
Other studies in progress are designed to examine the second 
exon of HoxB1 and the non-coding regions of both genes. 

Example 8 

Identification of Additional Polymorphisms in HoxB1 and 
HoxD1 Associated with Autism 

The procedures for evaluating the candidate gene HoxD1, 
as well as for finding additional polymorphisms in HoxA1 
and HoxB1, will be the same as for those already identified 
in HoxA1 and HoxB1. Mutation detection in the coding 
sequence of these genes will consist of PCR amplification, 
cloning and sequencing. Mutation detection for the entire 
genes will include large deletion/insertion analysis by 
Southern blotting, analysis of 200-400 bp fragments by 
SSCP or heteroduplex analysis, and of course cloning and 
sequencing when heterozygosity becomes apparent for any 
region of the genes. See Current Protocols in Human Genetics 
(John Wiley & Sons, Inc.), Chapter 7, "Searching Candidate 
Genes for Mutations." 

Biological samples already isolated from patients with 
autism which did not show any abnormalities in HoxA1 or 
HoxB1 will be screened for polymorphisms in HoxD1. 

Although preferred embodiments have been depicted and 
described in detail herein, it will be apparent to those skilled 
in the relevant art that various modifications, additions, 
substitutions, and the like can be made without departing 
from the spirit of the invention and these therefore are 
considered within the scope of the invention as defined in the 
claims which follow. 



SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES: 10 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1008 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60 

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120 

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180 

ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCACC ACCCCCAGCC GGCTACCTAC 240 

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300 

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360 

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420 

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480 

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540 

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600 

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660 

GGAGAGTACG GCTACCTGGG TCAACCCAAC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720 

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780 

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840 

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900 

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960 

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 335 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



US 6,228,582 B1 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Asp Asn Ala Arg Met Asn Ser Phe Leu Glu Tyr Pro Ile Leu Ser 
1 5 10 15 

Ser Gly Asp Ser Gly Thr Cys Ser Ala Arg Ala Tyr Pro Ser Asp His 
20 25 30 

Arg Ile Thr Thr Phe Gln Ser Cys Ala Val Ser Ala Asn Ser Cys Gly 
35 40 45 

Gly Asp Asp Arg Phe Leu Val Gly Arg Gly Val Gln Ile Gly Ser Pro 
50 55 60 

His His His His His His His His His His Pro Gln Pro Ala Thr Tyr 
65 70 75 80 

Gln Thr Ser Gly Asn Leu Gly Val Ser Tyr Ser His Ser Ser Cys Gly 
85 90 95 

Pro Ser Tyr Gly Ser Gln Asn Phe Ser Ala Pro Tyr Ser Pro Tyr Ala 
100 105 110 

Leu Asn Gln Glu Ala Asp Val Ser Gly Gly Tyr Pro Gln Cys Ala Pro 
115 120 125 

Ala Val Tyr Ser Gly Asn Leu Ser Ser Pro Met Val Gln His His His 
130 135 140 

His His Gln Gly Tyr Ala Gly Gly Ala Val Gly Ser Pro Gln Tyr Ile 
145 150 155 160 

His His Ser Tyr Gly Gln Glu His Gln Ser Leu Ala Leu Ala Thr Tyr 
165 170 175 

Asn Asn Ser Leu Ser Pro Leu His Ala Ser His Gln Glu Ala Cys Arg 
180 185 190 

Ser Pro Ala Ser Glu Thr Ser Ser Pro Ala Gln Thr Phe Asp Trp Met 
195 200 205 

Lys Val Lys Arg Asn Pro Pro Lys Thr Gly Lys Val Gly Glu Tyr Gly 
210 215 220 

Tyr Leu Gly Gln Pro Asn Ala Val Arg Thr Asn Phe Thr Thr Lys Gln 
225 230 235 240 

Leu Thr Glu Leu Glu Lys Glu Phe His Phe Asn Lys Tyr Leu Thr Arg 
245 250 255 

Ala Arg Arg Val Glu Ile Ala Ala Ser Leu Gln Leu Asn Glu Thr Gln 
260 265 270 

Val Lys Ile Trp Phe Gln Asn Arg Arg Met Lys Gln Lys Lys Arg Glu 
275 280 285 

Lys Glu Gly Leu Leu Pro Ile Ser Pro Ala Thr Pro Pro Gly Asn Asp 
290 295 300 

Glu Lys Ala Glu Glu Ser Ser Glu Lys Ser Ser Ser Ser Pro Cys Val 
305 310 315 320 

Pro Ser Pro Gly Ser Ser Thr Ser Asp Thr Leu Thr Thr Ser His 
325 330 335 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1008 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

ATGGACAATG CAAGAATGAA CTCCTTCCTG GAATACCCCA TACTTAGCAG TGGCGACTCG 60 

GGGACCTGCT CAGCCCGAGC CTACCCCTCG GACCATAGGA TTACAACTTT CCAGTCGTGC 120 

GCGGTCAGCG CCAACAGTTG CGGCGGCGAC GACCGCTTCC TAGTGGGCAG GGGGGTGCAG 180 

ATCGGTTCGC CCCACCACCA CCACCACCAC CACCATCGCC ACCCCCAGCC GGCTACCTAC 240 

CAGACTTCCG GGAACCTGGG GGTGTCCTAC TCCCACTCAA GTTGTGGTCC AAGCTATGGC 300 

TCACAGAACT TCAGTGCGCC TTACAGCCCC TACGCGTTAA ATCAGGAAGC AGACGTAAGT 360 

GGTGGGTACC CCCAGTGCGC TCCCGCTGTT TACTCTGGAA ATCTCTCATC TCCCATGGTC 420 

CAGCATCACC ACCACCACCA GGGTTATGCT GGGGGCGCGG TGGGCTCGCC TCAATACATT 480 

CACCACTCAT ATGGACAGGA GCACCAGAGC CTGGCCCTGG CTACGTATAA TAACTCCTTG 540 

TCCCCTCTCC ACGCCAGCCA CCAAGAAGCC TGTCGCTCCC CCGCATCGGA GACATCTTCT 600 

CCAGCGCAGA CTTTTGACTG GATGAAAGTC AAAAGAAACC CTCCCAAAAC AGGGAAAGTT 660 

GGAGAGTACG GCTACCTGGG TCAACCCARC GCGGTGCGCA CCAACTTCAC TACCAAGCAG 720 

CTCACGGAAC TGGAGAAGGA GTTCCACTTC AACAAGTACC TGACGCGCGC CCGCAGGGTG 780 

GAGATCGCTG CATCCCTGCA GCTCAACGAG ACCCAAGTGA AGATCTGGTT CCAGAACCGC 840 

CGAATGAAGC AAAAGAAACG TGAGAAGGAG GGTCTCTTGC CCATCTCTCC GGCCACCCCG 900 

CCAGGAAACG ACGAGAAGGC CGAGGAATCC TCAGAGAAGT CCAGCTCTTC GCCCTGCGTT 960 

CCTTCCCCGG GGTCTTCTAC CTCAGACACT CTGACTACCT CCCACTGA 1008 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 335 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Asp Asn Ala Arg Met Asn Ser Phe Leu Glu Tyr Pro Ile Leu Ser 
1 5 10 15 

Ser Gly Asp Ser Gly Thr Cys Ser Ala Arg Ala Tyr Pro Ser Asp His 
20 25 30 

Arg Ile Thr Thr Phe Gln Ser Cys Ala Val Ser Ala Asn Ser Cys Gly 
35 40 45 

Gly Asp Asp Arg Phe Leu Val Gly Arg Gly Val Gln Ile Gly Ser Pro 
50 55 60 

His His His His His His His His Arg His Pro Gln Pro Ala Thr Tyr 
65 70 75 80 

Gln Thr Ser Gly Asn Leu Gly Val Ser Tyr Ser His Ser Ser Cys Gly 
85 90 95 

Pro Ser Tyr Gly Ser Gln Asn Phe Ser Ala Pro Tyr Ser Pro Tyr Ala 
100 105 110 

Leu Asn Gln Glu Ala Asp Val Ser Gly Gly Tyr Pro Gln Cys Ala Pro 
115 120 125 

Ala Val Tyr Ser Gly Asn Leu Ser Ser Pro Met Val Gln His His His 
130 135 140 

His His Gln Gly Tyr Ala Gly Gly Ala Val Gly Ser Pro Gln Tyr Ile 
145 150 155 160 

His His Ser Tyr Gly Gln Glu His Gln Ser Leu Ala Leu Ala Thr Tyr 
165 170 175 

Asn Asn Ser Leu Ser Pro Leu His Ala Ser His Gln Glu Ala Cys Arg 
180 185 190 

Ser Pro Ala Ser Glu Thr Ser Ser Pro Ala Gln Thr Phe Asp Trp Met 
195 200 205 

Lys Val Lys Arg Asn Pro Pro Lys Thr Gly Lys Val Gly Glu Tyr Gly 
210 215 220 

Tyr Leu Gly Gln Pro Asn Ala Val Arg Thr Asn Phe Thr Thr Lys Gln 
225 230 235 240 

Leu Thr Glu Leu Glu Lys Glu Phe His Phe Asn Lys Tyr Leu Thr Arg 
245 250 255 

Ala Arg Arg Val Glu Ile Ala Ala Ser Leu Gln Leu Asn Glu Thr Gln 
260 265 270 

Val Lys Ile Trp Phe Gln Asn Arg Arg Met Lys Gln Lys Lys Arg Glu 
275 280 285 

Lys Glu Gly Leu Leu Pro Ile Ser Pro Ala Thr Pro Pro Gly Asn Asp 
290 295 300 

Glu Lys Ala Glu Glu Ser Ser Glu Lys Ser Ser Ser Ser Pro Cys Val 
305 310 315 320 

Pro Ser Pro Gly Ser Ser Thr Ser Asp Thr Leu Thr Thr Ser His 
325 330 335 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCCA ACCTCCTTTC CCCCAAGCTC GGCTCAGGCG 120 

GTTGACAGCT ATGCAAGCGA GGGCCGCTAC GGTGGGGGGC TGTCCAGCCC TGCGTTTCAG 180 

CAGAACTCCG GCTATCCCGC CCAGCAGCCG CCTTCGACCC TGGGGGTGCC CTTCCCCAGC 240 

TCCGCGCCCT CGGGGTATGC TCCTGCCGCC TGCAGCCCCA GCTACGGGCC TTCTCAGTAC 300 

TACCCTCTGG GTCAATCAGA AGGAGACGGA GGCTATTTTC ATCCCTCGAG CTACGGGGCC 360 

CAGCTAGGGG GCTTGTCCGA TGGCTACGGA GCAGGTGGAG CCGGTCCGGG GCCATATCCT 420 

CCGCAGCATC CCCCTTATGG GAACGAGCAG ACCGCGAGCT TTGCACCGGC CTATGCTGAT 480 

CTCCTCTCCG AGGACAAGGA AACACCCTGC CCTTCAGAAC CTAACACCCC CACGGCCCGG 540 

ACCTTCGACT GGATGAAGGT TAAGAGAAAC CCACCCAAGA CAGCGAAGGT GTCAGAGCCA 600 

GGCCTGGGCT CGCCCAGTGG CCTCCGCACC AACTTCACCA CAAGGCAGCT GACAGAACTG 660 

GAAAAGGAGT TCCATTTCAA CAAGTACCTG AGCCGGGCCC GGAGGGTGGA GATTGCCGCC 720 

ACCCTGGAGC TCAATGAAAC ACAGGTCAAG ATTTGGTTCC AGAACCGACG AATGAAGCAG 780 

AAGAAGCGCG AGCGAGAGGG AGGTCGGGTC CCCCCAGCCC CACCAGGCTG CCCCAAGGAG 840 

GCAGCTGGAG ATGCCTCAGA CCAGTCGACA TGCACCTCCC CGGAAGCCTC ACCCAGCTCT 900 

GTCACCTCCT GAACTGAACC TAGCCACCAA TGGGGCTTCC AGGCACTGGA GCGCCCCAGT 960 

CCAGCCCTAT CCCAGGCTCT CCCAACCCAG GCCTGGCTTC ACTGCCTGGG ATCTCTAGGC 1020 

T 1021 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Met Asp Tyr Asn Arg Met Asn Ser Phe Leu Glu Tyr Pro Leu Cys Asn 
1 5 10 15 

Arg Gly Pro Ser Ala Tyr Ser Ala His Ser Ala Pro Thr Ser Phe Pro 
20 25 30 

Pro Ser Ser Ala Gln Ala Val Asp Ser Tyr Ala Ser Glu Gly Arg Tyr 
35 40 45 

Gly Gly Gly Leu Ser Ser Pro Ala Phe Gln Gln Asn Ser Gly Tyr Pro 
50 55 60 

Ala Gln Gln Pro Pro Ser Thr Leu Gly Val Pro Phe Pro Ser Ser Ala 
65 70 75 80 

Pro Ser Gly Tyr Ala Pro Ala Ala Cys Ser Pro Ser Tyr Gly Pro Ser 
85 90 95 

Gln Tyr Tyr Pro Leu Gly Gln Ser Glu Gly Asp Gly Gly Tyr Phe His 
100 105 110 

Pro Ser Ser Tyr Gly Ala Gln Leu Gly Gly Leu Ser Asp Gly Tyr Gly 
115 120 125 

Ala Gly Gly Ala Gly Pro Gly Pro Tyr Pro Pro Gln His Pro Pro Tyr 
130 135 140 

Gly Asn Glu Gln Thr Ala Ser Phe Ala Pro Ala Tyr Ala Asp Leu Leu 
145 150 155 160 

Ser Glu Asp Lys Glu Thr Pro Cys Pro Ser Glu Pro Asn Thr Pro Thr 
165 170 175 

Ala Arg Thr Phe Asp Trp Met Lys Val Lys Arg Asn Pro Pro Lys Thr 
180 185 190 

Ala Lys Val Ser Glu Pro Gly Leu Gly Ser Pro Ser Gly Leu Arg Thr 
195 200 205 

Asn Phe Thr Thr Arg Gln Leu Thr Glu Leu Glu Lys Glu Phe His Phe 
210 215 220 

Asn Lys Tyr Leu Ser Arg Ala Arg Arg Val Glu Ile Ala Ala Thr Leu 
225 230 235 240 

Glu Leu Asn Glu Thr Gln Val Lys Ile Trp Phe Gln Asn Arg Arg Met 
245 250 255 

Lys Gln Lys Lys Arg Glu Arg Glu Gly Gly Arg Val Pro Pro Ala Pro 
260 265 270 

Pro Gly Cys Pro Lys Glu Ala Ala Gly Asp Ala Ser Asp Gln Ser Thr 
275 280 285 

Cys Thr Ser Pro Glu Ala Ser Pro Ser Ser Val Thr Ser 
290 295 300 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1030 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

TGACGCATGG ACTATAATAG GATGAACTCC TTCTTAGAGT ACCCACTCTG TAACCGGGGA 60 

CCCAGCGCCT ACAGCGCCCA CAGCGCCCAC AGCGCCCCAA CCTCCTTTCC CCCAAGCTCG 120 

GCTCAGGCGG TTGACAGCTA TGCAAGCGAG GGCCGCTACG GTGGGGGGCT GTCCAGCCCT 180 

GCGTTTCAGC AGAACTCCGG CTATCCCGCC CAGCAGCCGC CTTCGACCCT GGGGGTGCCC 240 

TTCCCCAGCT CCGCGCCCTC GGGGTATGCT CCTGCCGCCT GCAGCCCCAG CTACGGGCCT 300 

TCTCAGTACT ACCCTCTGGG TCAATCAGAA GGAGACGGAG GCTATTTTCA TCCCTCGAGC 360 

TACGGGGCCC AGCTAGGGGG CTTGTCCGAT GGCTACGGAG CAGGTGGAGC CGGTCCGGGG 420 

CCATATCCTC CGCAGCATCC CCCTTATGGG AACGAGCAGA CCGCGAGCTT TGCACCGGCC 480 

TATGCTGATC TCCTCTCCGA GGACAAGGAA ACACCCTGCC CTTCAGAACC TAACACCCCC 540 

ACGGCCCGGA CCTTCGACTG GATGAAGGTT AAGAGAAACC CACCCAAGAC AGCGAAGGTG 600 

TCAGAGCCAG GCCTGGGCTC GCCCAGTGGC CTCCGCACCA ACTTCACCAC AAGGCAGCTG 660 

ACAGAACTGG AAAAGGAGTT CCATTTCAAC AAGTACCTGA GCCGGGCCCG GAGGGTGGAG 720 

ATTGCCGCCA CCCTGGAGCT CAATGAAACA CAGGTCAAGA TTTGGTTCCA GAACCGACGA 780 

ATGAAGCAGA AGAAGCGCGA GCGAGAGGGA GGTCGGGTCC CCCCAGCCCC ACCAGGCTGC 840 

CCCAAGGAGG CAGCTGGAGA TGCCTCAGAC CAGTCGACAT GCACCTCCCC GGAAGCCTCA 900 

CCCAGCTCTG TCACCTCCTG AACTGAACCT AGCCACCAAT GGGGCTTCCA GGCACTGGAG 960 

CGCCCCAGTC CAGCCCTATC CCAGGCTCTC CCAACCCAGG CCTGGCTTCA CTGCCTGGGA 1020 

TCTCTAGGCT 1030 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Met Asp Tyr Asn Arg Met Asn Ser Phe Leu Glu Tyr Pro Leu Cys Asn 
1 5 10 15 

Arg Gly Pro Ser Ala Tyr Ser Ala His Ser Ala His Ser Ala Pro Thr 
20 25 30 

Ser Phe Pro Pro Ser Ser Ala Gln Ala Val Asp Ser Tyr Ala Ser Glu 
35 40 45 

Gly Arg Tyr Gly Gly Gly Leu Ser Ser Pro Ala Phe Gln Gln Asn Ser 
50 55 60 

Gly Tyr Pro Ala Gln Gln Pro Pro Ser Thr Leu Gly Val Pro Phe Pro 
65 70 75 80 

Ser Ser Ala Pro Ser Gly Tyr Ala Pro Ala Ala Cys Ser Pro Ser Tyr 
85 90 95 

Gly Pro Ser Gln Tyr Tyr Pro Leu Gly Gln Ser Glu Gly Asp Gly Gly 
100 105 110 

Tyr Phe His Pro Ser Ser Tyr Gly Ala Gln Leu Gly Gly Leu Ser Asp 
115 120 125 

Gly Tyr Gly Ala Gly Gly Ala Gly Pro Gly Pro Tyr Pro Pro Gln His 
130 135 140 

Pro Pro Tyr Gly Asn Glu Gln Thr Ala Ser Phe Ala Pro Ala Tyr Ala 
145 150 155 160 

Asp Leu Leu Ser Glu Asp Lys Glu Thr Pro Cys Pro Ser Glu Pro Asn 
165 170 175 

Thr Pro Thr Ala Arg Thr Phe Asp Trp Met Lys Val Lys Arg Asn Pro 
180 185 190 

Pro Lys Thr Ala Lys Val Ser Glu Pro Gly Leu Gly Ser Pro Ser Gly 
195 200 205 

Leu Arg Thr Asn Phe Thr Thr Arg Gln Leu Thr Glu Leu Glu Lys Glu 
210 215 220 

Phe His Phe Asn Lys Tyr Leu Ser Arg Ala Arg Arg Val Glu Ile Ala 
225 230 235 240 

Ala Thr Leu Glu Leu Asn Glu Thr Gln Val Lys Ile Trp Phe Gln Asn 
245 250 255 

Arg Arg Met Lys Gln Lys Lys Arg Glu Arg Glu Gly Gly Arg Val Pro 
260 265 270 

Pro Ala Pro Pro Gly Cys Pro Lys Glu Ala Ala Gly Asp Ala Ser Asp 
275 280 285 

Gln Ser Thr Cys Thr Ser Pro Glu Ala Ser Pro Ser Ser Val Thr Ser 
290 295 300 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

GCATGGACTA TAATAGGATG 20 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

TCTTGGGTGG GTTTCTCTTA 20 



What is claimed: 

1. A method for screening subjects for genetic markers 
associated with autism, comprising: 

isolating a biological sample from a mammal; and 
testing the sample or genetic material isolated from the 
sample for a polymorphism in a Hox A1 or B1 coding 
sequence which is a genetic marker for autism. 

2. The method according to claim 1, wherein the biologi- 
cal sample is selected from the group consisting of blood, 
saliva, amniotic fluid, and tissue. 

3. The method according to claim 2, wherein the biologi- 
cal sample is blood. 



4. The method according to claim 1, wherein the mammal 
is a human. 

5. The method according to claim 4, wherein the biologi- 
cal sample is isolated from developmentally disabled chil- 
dren. 

6. The method according to claim 4, wherein the biologi- 
cal sample is isolated from parents or relatives of develop- 
mentally disabled children. 

7. The method according to claim 4, wherein the biological 
sample is isolated from children and said method further 
comprises: 
early behavior training for children having genetic markers 
associated with autism. 

8. The method according to claim 1, wherein the poly- 
morphism is located in the homeobox. 

9. The method according to claim 1, wherein the coding 
sequence has a single base substitution resulting in an amino 
acid substitution. 

10. The method according to claim 9, wherein the amino 
acid substitution is an arginine for a histidine. 

11. The method according to claim 10, wherein the coding 
sequence has an insertion. 

12. The method according to claim 11, wherein the 
insertion is 5'-ACAGCGCCC-3'. 

13. The method according to claim 1, wherein the coding 
sequence has a polymorphism selected from the group 
consisting of a single base substitution resulting in an amino 
acid substitution, a single base substitution resulting in a 
translational stop, an insertion, a deletion, and a rearrange- 
ment. 

14. The method according to claim 1, wherein the poly- 
morphism alters the sequence of the polypeptide encoded by 
the coding sequence. 

15. The method according to claim 1, wherein said 
screening for mutated nucleic acids is carried out by a 
method selected from the group consisting of direct 
sequencing of nucleic acids, single strand polymorphism 
assay, restriction fragment length polymorphism assay, 
ligase chain reaction, enzymatic cleavage and Southern 
hybridization. 

16. The method according to claim 15, wherein said 
screening is carried out by direct sequencing of nucleic 
acids. 

17. The method according to claim 15, wherein said 
screening is carried out by single strand polymorphism 
assay. 

18. The method according to claim 15, wherein said 
screening is carried out by restriction fragment length 
polymorphism assay. 

19. The method according to claim 15, wherein said 
screening is carried out by ligase chain reaction. 

20. The method according to claim 15, wherein said 
screening is carried out by enzymatic cleavage. 

21. The method according to claim 15, wherein said 
screening is carried out by Southern hybridization. 

22. The method according to claim 15, wherein the 
nucleic acid is a deoxyribonucleic acid. 

23. The method according to claim 15, wherein the 
nucleic acid is a messenger ribonucleic acid. 

24. An isolated nucleic acid molecule comprising the 
nucleotide sequence set forth in SEQ ID NO: 1, wherein the 
nucleic acid molecule comprises a single base substitution at 
nucleotide 218. 

25. The isolated nucleic acid molecule comprising the 
nucleotide sequence set forth in SEQ ID NO:5, wherein the 
nucleic acid molecule comprises an insertion between nucle- 
otides 88 and 89. 

26. The isolated nucleic acid molecule according to claim 
25, wherein the insertion is 5'-ACAGCGCCC-3'. 

27. An isolated nucleic acid molecule consisting of at least 
15 contiguous nucleotides of the coding sequence set forth 
in SEQ ID NO:5 wherein the molecule comprises an insertion 
between nucleotides 88 and 89 in SEQ ID NO:5 and 
wherein the molecule specifically binds to a HoxA1 or 
HoxB1 coding sequence but does not bind to other coding 
sequences. 

28. An isolated nucleic acid molecule consisting of at least 
15 contiguous nucleotides of the coding sequence set forth 
in SEQ ID NO:1 wherein the molecule comprises a single 
base substitution at nucleotide 218 and wherein the molecule 
specifically binds to a HoxA1 or HoxB1 coding sequence 
but does not bind to other coding sequences. 

29. The method according to claim 1 wherein the coding 
sequence has a mutation in an exon. 

* * * * * 

[57] ABSTRACT 

Genetic markers associated with programmed cell death 
were characterized and their extent of polymorphism in 
normal populations was determined allowing for a method 
for determining genetic predisposition to SLE and other 
autoimmune diseases by genotyping. The allelic distribution 
of these gene markers in a large Mexican American SLE 
cohort and ethnically matched controls was determined. The 
results were that bcl-2, Fas-L, and IL-10 loci showed 
significantly different allelic distribution in SLE patients 
compared with controls, indicating an association between 
these genes and SLE. The method allows for determining the 
presence of these alleles. Alone, the presence of each of 
these alleles is associated with a moderate increase in SLE 
risk, while the occurrence of these alleles together increases 
the odds of developing SLE by more than 40-fold. 

10 Claims, No Drawings 

METHODS FOR DETERMINING GENETIC 
PREDISPOSITION TO AUTOIMMUNE 
DISEASES BY GENOTYPING APOPTOTIC 
GENES 

FIELD OF THE INVENTION 

This invention relates generally to methods for determining 
predisposition to systemic lupus erythematosus (SLE) 
and other autoimmune diseases by genotyping IL-10, bcl-2, 
FAS ligand (FAS-L) and other apoptotic genes. More 
specifically, the bcl-2, Fas-L, and IL-10 loci showed significantly 
different allelic distribution in SLE patients compared 
with controls, indicating an association between these genes 
and SLE. Additionally, further analysis revealed a synergistic 
effect between susceptibility alleles of the bcl-2 and 
IL-10 genes in determining disease susceptibility. 

BACKGROUND OF THE INVENTION 

Systemic lupus erythematosus (SLE) is considered to be 
the prototype of human autoimmune diseases. It is a disorder 
of generalized autoimmunity characterized by multisystem 
organ involvement, polyclonal B cell activation, and the 
production of autoantibodies against nuclear, cytoplasmic, 
and cell surface antigens. Autoreactive B and T lymphocytes 
can be found in healthy individuals as well, but their 
numbers are tightly regulated by a process of programmed 
cell death (apoptosis), which is crucial in the establishment 
of self-tolerance. Tolerance to self antigens can fail and can 
result in autoimmunity if there is a defect in the process of 
elimination of these cells. 

SLE, as well as most other autoimmune diseases, is 
difficult to diagnose. A strict definition of SLE patients 
included 4 or more of the 11 ACR revised criteria for SLE, 
eliminating LE cells but adding anticardiolipin antibodies 
and lupus anticoagulant as criteria (Tan, et al. Arthritis 
Rheum. 1982:25:1271-7). Often, a patient who is going to 
develop SLE will be kept off of treatment because they only 
show two or three of the criteria. One of the major tests, the 
ANA test (anti-nuclear antibodies), tests for the presence of 
these antibodies. However, 15-20% of those individuals 
with a positive ANA will never develop disease. The inadequacy 
of definitive tests for the diagnosis of autoimmune 
diseases is a recurrent theme. For this reason, treatment is 
often not started until disease is too far along and irreversible 
damage has occurred. Therefore, development of a test for 
the diagnosis of, and susceptibility to, autoimmune diseases 
could have a profound effect on the outcome of the disease 
and the patient's quality of life. 

Several lines of evidence suggest that dysfunctional programmed 
cell death (apoptosis) might be involved in the 
pathogenesis of SLE and other autoimmune diseases. It has 
been postulated that in SLE, dysfunction of apoptosis could 
result in the inappropriate longevity of autoreactive B 
lymphocytes, allowing autoantibody levels to reach pathogenic 
thresholds and breakdown of self tolerance. Defective 
apoptosis of autoreactive lymphocytes is an attractive 
mechanism contributing to SLE, primarily because defects 
in either the apoptosis-promoting Fas gene or its ligand 
Fas-L (CD95L) accelerate autoimmunity in mouse strains 
(MRL-lpr/lpr and C3H-gld/gld, respectively) that exhibit 
SLE-like diseases. Furthermore, studies reveal links 
between autoimmunity and several other gene products 
involved in apoptosis. The bcl-2 gene enhances lymphocyte 
survival by inhibiting or delaying apoptosis. Transgenic 
mice overexpressing bcl-2 in their B cells show polyclonal 
B cell expansion and extended survival in vitro. After a few 
months, these mice developed an autoimmune syndrome 
resembling SLE. 

Interleukin-10 (IL-10) is a pleiotropic cytokine that regulates 
many immune and inflammatory responses. Among 
other activities, this cytokine increases the survival of activated 
lymphocytes. Furthermore, administration of recombinant 
IL-10 to lupus-prone (New Zealand black × New 
Zealand white)F1 ([NZB×NZW]F1) mice accelerates the 
development of autoimmunity. CTLA-4 is an additional 
gene involved in apoptosis that has been suggested to be 
associated with autoimmune disease development. CTLA-4 
can mediate antigen-specific apoptosis and appears to be 
part of a distinct signaling pathway capable of clonally 
deleting previously activated human T lymphocytes. 
CTLA-4 also warrants further study because it may be a 
candidate gene in more than one autoimmune disease. 
CTLA-4 was reported to be associated with 2 autoimmune 
diseases, Graves' disease and insulin-dependent diabetes 
mellitus. 

In a recent publication, Eskdale et al (Tissue Antigens
1997:49:635-9) have shown an association between an
IL-10 microsatellite polymorphism and SLE in a Caucasian
population. In this study a group of 56 Caucasian SLE
patients from Great Britain were compared with 102
ethnically matched controls. However, because of the moderate
sample size, the results were considered only as a framework
for further study.

SUMMARY OF THE INVENTION 

One object of the present invention is a method for 
determining predisposition to an autoimmune disease by 
obtaining a patient sample, amplifying at least two apoptotic 
loci, and determining whether the disease-specific allele is 
present. 

A further embodiment includes identifying the
disease-specific alleles by comparing the most abundant allele found
in patients with disease to that found in normal individuals.
Preferably the apoptotic gene loci are selected from the group
consisting of IL-10, bcl-2, Fas-L, and CTLA-4. Preferably, the IL-10
disease-associated allele is PCR amplified with the primers
comprising SEQ ID NO:1 and SEQ ID NO:2, the bcl-2
disease-associated allele is PCR amplified with the primers
comprising SEQ ID NO:3 and SEQ ID NO:4, the Fas-L
disease-associated allele is PCR amplified with the primers
comprising SEQ ID NO:5 and SEQ ID NO:6, and the
CTLA-4 disease-associated allele is PCR amplified with the
primers comprising SEQ ID NO:7 and SEQ ID NO:8. In a
further preferred embodiment, the disease-associated allele
is identified by size or sequence. Preferably, the disease is
selected from the group consisting of: systemic lupus
erythematosus, thyroid autoimmunity syndromes,
insulin-dependent diabetes mellitus, inflammatory bowel disease,
rheumatoid arthritis and other arthritides.

A further object of the invention is a kit for determining 
predisposition to an autoimmune disease comprising the 
method of claim 1. 

A further object of the invention is a method for producing
a diagnostic test for predisposition to an autoimmune disease
which involves obtaining a patient sample, PCR amplifying
at least two apoptotic loci, identifying the
disease-specific alleles by comparison to normal individuals, and finally
determining whether the disease-specific allele is present in
a test patient's sample.

DETAILED DESCRIPTION OF THE 
INVENTION 

Because bcl-2, Fas-L, CTLA-4, and IL-10 participate in 
apoptosis, and because of the evidence suggesting that these 
genes may be involved in the pathogenesis of SLE, we tested
whether there is an association between these genes and SLE 
in humans. 

The method of the present invention comprises a
technique for determining the presence of disease-associated
alleles of apoptotic genes and analyzing whether they show
predisposition to autoimmune diseases.

Further features and advantages will become apparent to 
those of skill in the art in view of the Detailed Description 
of the Invention which follows, when considered together 
with the attached claims. 

Although other materials and methods can be used in the 
practice or testing of the present invention, a method is now 
described. Examples 1-3 show how a method for determin- 
ing predisposition to an autoimmune disease can be devel- 
oped. 

EXAMPLE 1 

Characteristics of the Study Population

Patients in this study were from the University of Southern
California (USC) School of Medicine clinics who were
confirmed to have met the American College of
Rheumatology (ACR) criteria for SLE. A strict definition of SLE
patients included 4 or more of the 11 ACR revised criteria
for SLE, eliminating LE cells but adding anticardiolipin
antibodies and lupus anticoagulant as criteria (Tan, et al.
Arthritis Rheum. 1982:25:1271-7).

We used semistructured personal or telephone interviews
to obtain a complete family history of each SLE patient and
control subject. Through these interviews, data were
collected describing a fixed family structure (proband's
grandparents, parents, siblings, and offspring, as well as
siblings and offspring of both of the proband's parents).
Information regarding the birthplace of the probands, their
parents, and their grandparents was also obtained. Whenever
possible, we obtained family history information about the
probands from an additional source (usually, a parent of the
subject).

For the purpose of this study, Mexican Americans were
defined as individuals born in Mexico or the US whose
grandparents from both the mother's and the father's side
were born in Mexico. Controls were defined as Mexican
American subjects who did not have SLE or any other
autoimmune disease and whose family lacked any
autoimmune disease history. The study protocol was approved by
the Institutional Review Board of the USC School of
Medicine.


EXAMPLE 2 

Genotypic Analysis of IL-10, bcl-2, Fas-L, and
CTLA-4 

Blood samples were collected from all participants, and
genomic DNA was extracted from the peripheral blood
mononuclear cells by standard procedures. To obtain
genotypes of the IL-10, bcl-2, and Fas-L, short tandem repeat
sequences (microsatellites) within the noncoding regions of
these genes were identified and used as intragenic markers.
The Fas-L (TG)n tandem repeat was identified in the
3'-untranslated region of the gene, ~600 basepairs after the
stop codon, while the IL-10 (CA)n microsatellite is located
~1 kb 5' to the ATG codon. The CTLA-4 dinucleotide repeat
begins at bp 642 of exon 3 of the human CTLA-4 gene
(Polymeropoulos, et al. Nucleic Acids Res 1991:19:4018),
and the bcl-2 (AC)n microsatellite is located 570 bp 5' to the
ATG codon. (The IL-10, bcl-2, and Fas-L gene sequences
can be found using Genomic Data Base accession numbers
X78437, X51898, and GenBank number U08137,
respectively).

To amplify or image these loci, PCR was performed as
follows: unique oligonucleotide sequences flanking each
microsatellite were designed as primers, one of which was
labeled with a fluorescent dye and used in the polymerase
chain reaction (PCR). The oligonucleotides flanking the
IL-10 (CA)n repeat were the 5' primer 5'-GCA ACA CTC
CTC GTC GCA AC-3' (SEQ ID NO:1) and the 3' primer,
tagged with the fluorescent dye 6FAM, 5'-CCT CCC AAA
GAA GCC TTA GTA G-3' (SEQ ID NO:2). The
oligonucleotides flanking the bcl-2 (AC)n repeat were the 5' primer,
tagged with the fluorescent dye TET, 5'-CGT GTA CAC
ACT CTC ATA CAC GGC T-3' (SEQ ID NO:3) and the 3'
primer 5'-GGG AGG GTG CGC CAT GAA AA-3' (SEQ ID
NO:4). The oligonucleotides flanking the Fas-L (TG)n repeat
were the 5' primer, tagged with the fluorescent dye 6FAM,
5 -CA C IT Cr AAA FGC ATA TCC TGA GCC-3 (SEQ ID 
NO:5) and the 3' primer 5'-TGT CAG GAA GCA TTC AAA
ATC TTG ACC A-3' (SEQ ID NO:6).

For CTLA-4, we used an (AT)n microsatellite marker
previously described (Polymeropoulos, et al. Nucleic Acids
Res 1991:19:4018). The oligonucleotides flanking the
CTLA-4 (AT)n repeat were the 5' primer, tagged with the
fluorescent dye TET, 5'-GCC ACT GAT GCT AAA GGT
TG-3' (SEQ ID NO:7) and the 3' primer 5'-AAC ATA CGT
GGC TCT ATG CA-3' (SEQ ID NO:8).

PCR amplification was carried out using 40 ng of
genomic DNA. The reaction conditions consisted of 0.5 μM
of each primer (labeled and unlabeled), 10 mM Tris HCl, pH
8.8, 50 mM KCl, 1.5 mM MgCl2, 50 μM of each dNTP, and
0.2 units of Taq polymerase. For IL-10, the samples were
processed through 30 cycles of 30 seconds at 94° C, 30
seconds at 57° C, and 30 seconds at 72° C. For CTLA-4 the
conditions were 30 seconds at 94° C, 120 seconds at 55° C,
and 30 seconds at 72° C.

A "touchdown" PCR assay for Fas-L and bcl-2
polymorphism was performed to circumvent spurious priming during
amplification. The initial annealing temperature was 66° C;
subsequent annealing temperatures were decreased by 1° C
every cycle to a "touchdown" annealing temperature of 55°
C, at which 30 cycles of 1 minute at 94° C, 1 minute at 55°
C, and 1 minute at 72° C were performed.
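The touchdown program described above can be written out as an explicit cycle list. The following is a minimal sketch only: the function name and the tuple layout are hypothetical, and it assumes that the touchdown-phase cycles use the same 94° C denaturation and 72° C extension temperatures as the final 30 cycles, which the text does not state.

```python
def touchdown_schedule(start_c=66, end_c=55, final_cycles=30):
    """Return a list of (denature, anneal, extend) temperatures in deg C.

    Assumption: denaturation (94) and extension (72) temperatures during the
    touchdown phase match those given for the final cycles.
    """
    cycles = []
    # Touchdown phase: one cycle per annealing temperature, dropping 1 deg C
    # per cycle from start_c down to (but not including) end_c.
    for anneal in range(start_c, end_c, -1):
        cycles.append((94, anneal, 72))
    # Plateau phase: repeated cycles at the touchdown annealing temperature.
    cycles.extend([(94, end_c, 72)] * final_cycles)
    return cycles


schedule = touchdown_schedule()
print(len(schedule), schedule[0], schedule[-1])
```

Under these assumptions the program has 11 touchdown cycles (66° C down to 56° C) followed by the 30 cycles at 55° C.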

Aliquots of the PCR product were electrophoresed on a 
377 Prism ABI sequencer (Applied Biosystems, Foster City, 
Calif.), and the fluorescent signal was recorded and analyzed 
by the Genescan software (Applied Biosystems). Different 
fluorescent dyes were plotted separately, and the sizes of the 
fluorescent peaks were estimated in basepairs by reference 
to the in-lane size standard Tamra 500 (Applied 
Biosystems). Microsatellite alleles were classified automatically
according to their size using the Genotyper software
(Applied Biosystems). For quality control to ensure repro- 
ducibility of allele assignments between gels, 1 lane in each 
gel was loaded with a sample that had previously been 
genotyped. Each lane of the sequencing gel was loaded with 
the internal size marker labeled with Tamra 500. In addition 
to the automated allele calling, we performed manual sur- 
veillance of every genotype. 

Although Example 2 makes use of PCR amplification to 
determine sequence length polymorphisms, one of skill in 
the art can readily identify other methods for the purpose of 
identifying disease-specific alleles. Single point mutations 
can also be readily identified using a number of techniques 
well known to those having ordinary skill in the art.
Examples of such methods to identify small allelic
differences include FISH (fluorescence in situ hybridization),
RFLP (restriction fragment length polymorphism), TGGE
(temperature gradient gel electrophoresis) and SSCP
(single-strand conformation polymorphism), each of which can be
used to identify differences in DNA or RNA. Pure
hybridization methods, such as Southern blotting or DNA chip
technology, can also be used. Alternatively differences in the 
protein product could be imaged or identified using such 
techniques as Western blotting, ELISA, or even enzymatic 
assays. 

EXAMPLE 3 

Statistical Analysis 

Associations between loci and the presence of SLE were
tested by fitting a logistic regression model to the data.
Genotypes at each locus were coded assuming a
multiplicative model for allelic effects. Under this model, the odds
ratio for a person with alleles $a_i$ and $a_j$ is given by
$e^{b_i} e^{b_j}$, where $b_i$ and $b_j$ are regression
coefficients corresponding to $a_i$ and $a_j$, respectively.
For each locus, alleles that occurred
in <3 subjects were eliminated from the analysis. The
likelihood ratio test was used as a global test of association
between each locus and the presence of SLE.
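As an illustration of the multiplicative coding, the genotype odds ratio is the product of the per-allele odds ratios. The sketch below is not the fitting procedure itself; the function name is hypothetical, and the coefficients are simply the natural logs of the per-allele bcl-2 odds ratios reported in Table 5 (5.61 for allele 193, 3.15 for allele 201, baseline allele 195).

```python
import math


def genotype_odds_ratio(coef, allele_a, allele_b):
    """Odds ratio for a genotype under a multiplicative allelic model.

    coef maps allele size (bp) to its regression coefficient b, i.e. the
    log odds ratio per allele copy relative to the baseline allele.
    """
    return math.exp(coef[allele_a]) * math.exp(coef[allele_b])


# Illustrative coefficients: the baseline allele has coefficient 0.
coef = {195: 0.0, 193: math.log(5.61), 201: math.log(3.15)}

print(genotype_odds_ratio(coef, 193, 195))  # one copy of allele 193
print(genotype_odds_ratio(coef, 193, 193))  # two copies of allele 193
print(genotype_odds_ratio(coef, 193, 201))  # one copy each of 193 and 201
```

Under this coding a second copy of an allele multiplies the odds by the same factor as the first copy.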

Pairwise interactions between IL-10, Fas-L, and bcl-2
alleles were modeled using a departure from a multiplicative
model for the corresponding joint locus effects. At each
locus, a genotype for each subject was coded based on the 
presence or absence of at least 1 copy of the corresponding 
high-risk allele. Using a logistic regression model, the 
likelihood ratio test was used to determine whether each 
interaction significantly improved the model fit compared 
with a model including only the main effects on the 2 
component loci. 

A significance level of 0.05 was used in all global testing.
A Bonferroni adjustment was used in determining the
significance of individual alleles. P values are reported for all
tests so that the reader may independently assess statistical
significance.
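The Bonferroni adjustment can be sketched as follows. This is an illustration under one assumption: the number of comparisons is taken to be the number of non-baseline alleles tested at a locus, consistent with the Table 5 footnote that the adjustment controls the error rate within a locus. The function name is hypothetical; the P values are the bcl-2 values from Table 5.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each allele whose P value passes the Bonferroni-adjusted threshold.

    p_values maps allele size (bp) to its unadjusted P value; the threshold
    is alpha divided by the number of alleles compared at the locus.
    """
    threshold = alpha / len(p_values)
    return {allele: p <= threshold for allele, p in p_values.items()}


# bcl-2 alleles versus the baseline allele (195), P values from Table 5.
bcl2_p = {191: 0.024, 193: 0.0001, 197: 0.215, 199: 0.596, 201: 0.006}
print(bonferroni_significant(bcl2_p))
```

With five comparisons the per-allele threshold is 0.01, so alleles 193 and 201 are flagged while allele 191 (P = 0.024) is not, which agrees with the alleles marked as significant in Table 5.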

EXAMPLE 4 

Association of SLE with Apoptotic Markers 

Highly polymorphic short tandem repeat sequences 
(microsatellites) within the noncoding regions of the Fas-L, 
bcl-2, and IL-10 genes were identified and characterized as 
part of the present study, and were used as markers (see
EXAMPLE 2). The polymorphism information content
scores were 0.72 for IL-10, 0.47 for bcl-2, 0.59 for Fas-L, 
and 0.83 for CTLA-4. 
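The specification does not state how the polymorphism information content (PIC) scores were calculated. A minimal sketch, assuming the widely used formula PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j², is given below; the function name is hypothetical, and the result is not asserted to reproduce the scores quoted above.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from a list of allele frequencies.

    Assumed formula: 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2.
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_term


# Example input: bcl-2 allele frequencies in Mexican American controls (Table 3).
bcl2_controls = [0.002, 0.177, 0.048, 0.700, 0.029, 0.009, 0.029]
print(round(pic(bcl2_controls), 2))
```

A marker with two equally frequent alleles gives a PIC of 0.375 under this formula, and a monomorphic marker gives 0.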

The allelic distribution of these microsatellites was deter- 
mined in several distinct ethnic populations, including Cau- 
casian Americans, African Americans, Chinese Americans, 
and Mexican Americans, and showed a significant variation 
among these ethnic groups. For example, Table 1 illustrates 
significant variation in bcl-2 allele frequencies among nor- 
mal individuals belonging to 4 major ethnicities in the US. 
The global likelihood ratio, testing for differences in allelic
distribution at bcl-2 among the 4 American populations, was
χ² = 149.7 (degrees of freedom [df] = 18, P = 0.001). Similar
ethnic variation in marker allele frequencies was found in 
the other genes tested in this study. The allele frequencies 
observed in control populations conform to Hardy-Weinberg 
expectations. 
TABLE 1

Allele distribution of bcl-2 microsatellite
in various American populations*

                          Frequency

Allele    CA            AA            MA            ChA
(bp)      (2n = 160)    (2n = 172)    (2n = 440)    (2n = 100)

187                                   0.002
191       0.063         0.169         0.177         0.260
193       0.025         0.081         0.048         0.030
195       0.831         0.430         0.700         0.470
197       0.025         0.047         0.029         0.120
199       0.006         0.041         0.009
201       0.031         0.180         0.029         0.120
203       0.019         0.047
207                     0.005

*2n = number of chromosomes scored to determine allele frequencies; CA
= Caucasian Americans; AA = African Americans; MA = Mexican
Americans; ChA = Chinese Americans; bp = basepairs.



Since SLE itself occurs at a higher frequency in certain
ethnic populations than in others, an association between the
disease and a gene marker might occur as a statistical artifact
in the mixed population. To minimize this potential problem
of population stratification, we decided to focus the study on
one ethnic population in detail. We focused on Mexican
Americans since they comprise the majority of SLE patients
in our center. The data presented below were obtained from
158 Mexican American SLE patients and 223 ethnically
matched control subjects. Selected clinical characteristics of
the SLE patients in the study are shown in Table 2. The two
cohorts (SLE patients and control subjects) were not
significantly different in age and sex distribution.

TABLE 2

Selected clinical characteristics of the study population*

                          SLE patients    Control subjects
Characteristic            (n = 158)       (n = 223)

Age, mean ± SD years      34.2 ± 11.6     35.4 ± 12.7
Female, %                 90.5            86
ANA positive, %           100
Anti-dsDNA positive, %    61
Renal involvement, %      35
CNS involvement, %        9

*SLE = systemic lupus erythematosus; ANA = antinuclear antibodies;
anti-dsDNA = anti-double-stranded DNA antibodies; CNS = central nervous
system.



The allelic distributions of microsatellite markers of the
bcl-2, IL-10, and Fas-L genes in SLE cases and in ethnically
matched controls are summarized in Table 3. Associations
between these loci and the presence of SLE were tested by
fitting a logistic regression model to the data (see
EXAMPLE 3).

Bcl-2. We identified 9 distinct alleles of the bcl-2 gene; the
most frequent allele in the controls was 195-bp long
(Bcl-2₁₉₅). The global likelihood ratio statistic, which tests for a
difference in allelic distribution at bcl-2 between cases and
controls, was XM4.95 (df=5, P=0.0001), indicating a
definite association between the bcl-2 gene and SLE.

IL-10. Regarding the IL-10 gene, 10 distinct alleles were
found in Mexican Americans. The most common allele in
the control population was 125-bp long (IL-10₁₂₅). The test
of association of this gene with SLE gave χ² = 33.20 (df = 8,
P = 0.0001), indicating an association.

Fas-L. The Fas-L intragenic marker showed 7 distinct
alleles; allele 241 was the most common in the control
population. The global likelihood ratio test statistic for Fas-L
was χ² = 23.99 (df = 6, P = 0.0005), suggesting an association
between Fas-L and SLE as well.

TABLE 3

Allele distribution of the intragenic markers of IL-10, bcl-2,
and Fas-L in Mexican American SLE patients and normal controls*

IL-10 allele frequency         bcl-2 allele frequency         Fas-L allele frequency

Allele Cases      Controls     Allele Cases      Controls     Allele Cases      Controls
(bp)   (2n = 316) (2n = 440)   (bp)   (2n = 312) (2n = 440)   (bp)   (2n = 298) (2n = 402)

121    0.003      0.009        187               0.002        233    0.013      0.003
123    0.025      0.050        189    0.006                   235    0.013      0.008
125    0.363      0.493        191    0.187      0.177        237    0.024      0.003
127    0.199      0.081        193    0.135      0.048        239    0.289      0.143
129    0.107      0.066        195    0.548      0.700        241    0.527      0.640
131    0.067      0.064        197    0.042      0.029        243    0.128      0.179
133    0.136      0.127        199    0.013      0.009        245    0.007      0.022
135    0.073      0.098        201    0.048      0.029
137    0.022      0.006        203    0.026
139               0.004

*2n = number of chromosomes scored to determine allele frequencies.
SLE = systemic lupus erythematosus.



CTLA-4. The CTLA-4 marker, however, showed no
association with SLE. As shown in Table 4, the CTLA-4 marker
had 19 distinct alleles in the Mexican American population.
The likelihood ratio test result between cases and controls
was χ² = 19.5 (df = 13, P = 0.1074). (Five alleles of CTLA-4
occurred so rarely in the data set that accurate estimates of
their odds ratios could not be calculated. These alleles were
left out of the analysis.)

To further investigate the significant associations, we
performed additional analyses to determine which allele(s)
of bcl-2, Fas-L, and IL-10 were associated with SLE. Table
5 summarizes the odds ratio (OR) and 95% confidence
intervals (95% CI) for the effect of each allele relative to a
baseline allele. The Bcl-2₁₉₃ and Bcl-2₂₀₁ alleles were
associated with increased odds of developing SLE (OR 5.61,
P = 0.0001 and OR 3.15, P = 0.006 per allele copy,
respectively, compared with Bcl-2₁₉₅).

With regard to IL-10, only the IL-10₁₂₇ allele was associated
with increased odds of developing SLE (OR 2.81 per allele
copy, as compared with IL-10₁₂₅, P = 0.0001). The Fas-L₂₃₉
allele was associated with increased odds of developing SLE
(OR 1.69 per allele copy), as compared with the Fas-L₂₄₁
allele (P = 0.001). As expected, the CTLA-4 gene showed no
specific allele association with SLE (Table 5).

EXAMPLE 5 

Synergistic Association of IL-10 and Bcl-2 Alleles

We next explored the possibility that synergistic effects
between these loci may increase the risk of developing SLE.
To this end, a departure from a multiplicative model for
corresponding allelic effects was tested. To minimize the
number of tests, we focused on a single high-risk allele at each
locus: allele 193 at bcl-2, allele 127 at IL-10, and allele 239
at Fas-L (see Table 5). The interaction tests are summarized
in Table 6. We found no significant interaction between
IL-10 and Fas-L, or between Fas-L and bcl-2. However,
surprisingly there was significant interaction between the
IL-10₁₂₇ allele and the Bcl-2₁₉₃ allele (P = 0.004). Of 23
subjects that carried both the Bcl-2₁₉₃ and IL-10₁₂₇ alleles,
22 had SLE. While a person carrying either the IL-10₁₂₇ or
the Bcl-2₁₉₃ allele only had an OR of ~2, a person carrying
both the IL-10₁₂₇ and the Bcl-2₁₉₃ susceptibility alleles
together had an OR of 40.71 (Table 7).

TABLE 4

Allele distribution of CTLA-4 in Mexican
American SLE patients and normal controls*

Allele    Cases         Controls
(bp)      (2n = 250)    (2n = 446)

88        0.564         0.581
94        0.004         0.006
96                      0.006
102       0.012         0.004
104       0.076         0.058
106       0.188         0.222
108       0.036         0.027
110       0.036         0.009
112       0.008         0.004
114       0.008         0.006
116                     0.004
118       0.012         0.070
120       0.016         0.002
122       0.008         0.004
124       0.016         0.012
126       0.008         0.011
128       0.004         0.006
130       0.004         0.020
132                     0.006

*2n = number of chromosomes scored to determine allele frequencies.
SLE = systemic lupus erythematosus.








TABLE 5

Association between SLE and specific alleles
of bcl-2, IL-10, and Fas-L, but not CTLA-4*

Locus     Allele            OR†      95% CI         P

bcl-2     Baseline (195)    1.00
          191               1.59     1.06, 2.37     0.024
          193               5.61     2.99, 10.53    0.0001‡
          197               1.75     0.72, 4.23     0.215
          199               1.42     0.39, 5.24     0.596
          201               3.15     1.38, 7.17     0.006‡
IL-10     Baseline (125)    1.00
          121               0.43     0.04, 4.02     0.455
          123               0.63     0.27, 1.50     0.296
          127               2.81     1.78, 4.44     0.0001‡
          129               1.96     1.14, 3.38     0.015
          131               1.28     0.71, 2.29     0.416
          133               1.50     0.96, 2.36     0.077
          135               1.11     0.64, 1.93     0.716
          137               3.73     0.98, 14.22    0.054
Fas-L     Baseline (241)    1.00
          233               5.77     0.63, 52.9     0.121
          235               2.03     0.44, 9.33     0.365
          237               4.23     0.76, 23.67    0.100
          239               1.69     1.23, 2.33     0.001‡
          243               0.90     0.59, 1.38     0.628
          245               0.44     0.10, 1.92     0.272
CTLA-4    Baseline (88)     1.00
          94                0.71     0.11, 4.67     0.724
          102               3.18     0.61, 16.65    0.170
          104               1.12     0.62, 2.02     0.704
          106               0.72     0.50, 1.03     0.070
          108               1.85     0.88, 3.90     0.106
          110               3.06     0.97, 9.67     0.056
          112               1.28     0.20, 8.31     0.793
          114               5.57     0.60, 51.41    0.129
          118               1.19     0.23, 6.16     0.829
          122               2.66     0.24, 29.93    0.428
          124               1.32     0.38, 4.64     0.662
          126               0.61     0.15, 2.51     0.492
          128               1.18     0.16, 8.62     0.865

*Each allele of the IL-10, bcl-2, Fas-L and CTLA-4 loci was compared
with a baseline allele of the corresponding locus. The most common allele
in the control group for a given locus was chosen as the baseline allele.
Shown are the odds ratios (OR) and the 95% confidence intervals (CI) of
the association between systemic lupus erythematosus (SLE) and various
alleles of the 4 loci compared with baseline.
†Wald test, testing H0: OR = 1 for each allele, compared with baseline.
‡Significant at the 0.05 level after Bonferroni adjustment, to control the
Type I error rate across multiple comparisons within a locus.



TABLE 6

Tests for interaction between loci*

Locus 1        Locus 2        χ²†      P

IL-10 (127)    Fas-L (239)    0.8t     0.37
IL-10 (127)    bcl-2 (193)    8.n      0.004
Fas-L (239)    bcl-2 (193)    ‡        ‡

*The likelihood ratio test was used to determine whether each
interaction significantly improved the fit compared with a model including only
the component main effects of the 2 loci. There were not enough cases
and controls carrying high-risk alleles at both bcl-2 and Fas-L to permit
estimation of an interaction between these loci.
†Likelihood ratio χ² for H0: no interaction effect.
‡Not applicable: insufficient data to calculate likelihood ratio χ².

Taken together, the data presented show a novel
association between 3 genes involved in apoptosis, bcl-2, Fas-L,
and IL-10, and SLE. CTLA-4 did not exhibit an association
with SLE. Furthermore, surprisingly, we have demonstrated
a synergistic effect between the susceptibility allele 193 of
the bcl-2 gene and the susceptibility allele 127 of the IL-10
gene in determining disease susceptibility.

TABLE 7

Synergistic effect of IL-10 and bcl-2 loci on SLE*

                               Sample size

IL-10             bcl-2    Controls    Cases    OR       95% CI

x/x               y/y      161         87       1.00
127/x, 127/127    y/y      31          27       1.61     0.90, 2.87
x/x               193/y    18          20       2.06     1.03, 4.09
127/x, 127/127    193/y    1           22       40.71    5.40, 307.2

*The odds ratios (ORs) shown are based on the parametric estimates in a
logistic model. x indicates any allele other than 127 for IL-10; y indicates
any allele other than 193 for bcl-2. SLE = systemic lupus erythematosus;
95% CI = 95% confidence interval.
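The entries of Table 7 can be checked against the sample sizes with a crude 2×2 calculation. The sketch below assumes an unadjusted odds ratio with a Wald (Woolf) confidence interval computed on the log scale; the table footnote states only that the estimates come from a logistic model, so this is an illustration and not necessarily the calculation that was used. The function name is hypothetical.

```python
import math


def odds_ratio_ci(cases_exp, controls_exp, cases_ref, controls_ref, z=1.959964):
    """Crude odds ratio and Wald 95% CI for one carrier group versus reference.

    cases_exp / controls_exp: counts in the carrier group.
    cases_ref / controls_ref: counts in the reference group (x/x, y/y).
    """
    odds_ratio = (cases_exp * controls_ref) / (controls_exp * cases_ref)
    # Standard error of ln(OR): square root of the summed reciprocal counts.
    se = math.sqrt(1 / cases_exp + 1 / controls_exp + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Reference group x/x, y/y: 87 cases and 161 controls (Table 7).
rows = {
    "IL-10 127 only": (27, 31),
    "bcl-2 193 only": (20, 18),
    "both alleles": (22, 1),
}
for label, (cases, controls) in rows.items():
    or_, lo, hi = odds_ratio_ci(cases, controls, 87, 161)
    print(f"{label}: OR {or_:.2f} (95% CI {lo:.2f}, {hi:.1f})")
```

With the sample sizes in Table 7, this crude calculation agrees with the tabulated odds ratios and confidence limits to the precision shown.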


In a case-control study investigating associations between
a disease and one or more genes, there is the potential for
bias in odds ratio estimates due to ethnic confounding,
commonly called population stratification. Depending on the
relationship between an ethnic confounder and the disease,
the gene-disease odds ratio may be either positively or
negatively biased. In practice, it is impossible to determine
the direction of bias unless the confounding variable(s) can
be directly measured and controlled for in the analysis. In
this study, we minimized the problem by obtaining both
cases and controls from the same ethnic group (Mexican
Americans), with the additional requirement that the
maternal and paternal grandparents of both cases and controls
must have been born in Mexico.

The markers we used are short tandem repeat sequences 
located in the noncoding regions of their respective genes. 
We relied on principles of linkage disequilibrium in our tests 
of association and in the corresponding inference that the 
genes as a whole might play a role in the SLE disease 
process. Linkage disequilibrium is valid over small genetic
distances (within 1 or 2 centimorgans), which obviously
covers the intragenic ranges of the genes in the study. In the
future it is likely to be found that such sequences are
functionally relevant to the expression and biologic proper- 
ties of these gene products. 

Whereas the IL-10 and the bcl-2 genes are both located on
chromosome 1 in the mouse, in the human they reside on
separate chromosomes; IL-10 is on chromosome 1q31-q32,
and the bcl-2 gene is on 18q21. Therefore, these genes are
not in linkage disequilibrium and the appearance of IL-10
susceptibility alleles together with bcl-2 susceptibility
alleles in SLE patients represents a true synergism.

EXAMPLE 6 

Application of the Test to Other Ethnic Groups 

The identification of disease-associated alleles for SLE in 
a Mexican American population is a clear indication that 
they will be present in other ethnic groups. However, the 
specific disease-associated allele may differ. For example,
the Caucasian and Mexican American populations share an
80-90% similar genetic background. It is likely that they
will share disease-associated alleles. However, other ethnic 
groups may have different disease-associated alleles. 
Therefore, the test for genetic predisposition in other ethnic 
groups would be as follows: 

A test group and a control group are identified. PCR is
performed on each of the apoptotic genes, bcl-2, IL-10,
Fas-L, and CTLA-4, using the primers as in Example 2. The
size of the PCR products is determined. Patients with SLE
are compared to a control group to determine the
disease-associated allele (by size or sequence). The test involves
identifying the presence of that allele for at least two and up
to four of the apoptotic genes.
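The final step of this procedure, checking a subject's genotypes for the disease-associated allele at two or more loci, can be sketched as follows. The names and the data layout are hypothetical, and the allele sizes are the high-risk alleles reported for the Mexican American cohort; for another ethnic group they would first have to be re-determined as described above.

```python
# Hypothetical lookup of disease-associated allele sizes (bp) per locus,
# taken from the high-risk alleles reported for the Mexican American cohort.
RISK_ALLELES = {"bcl-2": 193, "IL-10": 127, "Fas-L": 239}


def loci_with_risk_allele(genotypes, risk_alleles=RISK_ALLELES):
    """Return the sorted loci at which at least one copy of the risk allele is present.

    genotypes maps locus name to a pair of allele sizes (bp).
    """
    return sorted(
        locus
        for locus, allele in risk_alleles.items()
        if allele in genotypes.get(locus, ())
    )


def meets_locus_rule(genotypes, minimum=2, risk_alleles=RISK_ALLELES):
    """True if the risk allele is present at `minimum` or more loci."""
    return len(loci_with_risk_allele(genotypes, risk_alleles)) >= minimum


subject = {"bcl-2": (193, 195), "IL-10": (127, 125), "Fas-L": (241, 241)}
print(loci_with_risk_allele(subject), meets_locus_rule(subject))
```

The default of two loci follows the "at least two" wording of the test; loci that were not genotyped are simply treated as not carrying the allele.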

Turner et al (Eur J Immunogenet 1997:24:1-8) identified
a single basepair polymorphism at -1082 in the promoter 
region of the human IL-10 gene which constitutes a G-to-A 
substitution. Production of IL-10 following concanavalin A 
stimulation of peripheral blood lymphocytes from individu- 
als carrying a G at position -1082 was significantly 
increased compared with those with an A at that position. 
The IL-10 dinucleotide marker used in the present study is
located within 50 basepairs of the -1082 G/A polymor- 
phism. It is likely, therefore, that there is linkage disequi- 
librium between the 2 polymorphisms. Without wishing to 
be bound by the hypothesis, it is likely that these polymor- 
phisms directly affect transcription factor binding and rates 
of transcription. 

EXAMPLE 7 

Application to Other Autoimmune Diseases 

Our data on the interaction between bcl-2 and IL-10 
underscore the importance of genes that regulate apoptosis 
in autoimmunity. Transgenic mice that over-express bcl-2 in 
B lymphocytes exhibit polyclonal expansion and extended 
survival in vitro. After a few months, these mice develop 
autoimmune syndromes resembling SLE, including the 
appearance of antihistone and anti-Sm autoantibodies and
immune complex-mediated nephritis. Recent studies in SLE
patients suggest that bcl-2 expression is elevated in both B 
and T lymphocytes. 

Therefore, it can be easily envisioned that a comparable
test for other autoimmune diseases would follow easily from
the above test for SLE. Different autoimmune diseases share
susceptibility regions on the chromosomes. Therefore, a test
for, in particular, thyroid autoimmunity syndromes such as
Graves' disease, insulin-dependent diabetes mellitus,
inflammatory bowel disease, rheumatoid arthritis and other
arthritides would be apparent from the SLE test.

The test for other autoimmune diseases would be as
follows: A test group and a control group are identified.
PCR is performed on each of the apoptotic genes, bcl-2,
IL-10, Fas-L, and CTLA-4, using the primers as in Example
2. The size of the PCR products is determined. Patients with
the autoimmune disease are compared to a control group to
determine the disease-associated allele (by size or sequence). The test
involves identifying the presence of that allele for at least
two and up to four of the apoptotic genes. Alternatively,
different primers could be used for the PCR.

In addition, the test for apoptotic susceptibility loci could 
be administered with other tests for autoimmunities to get a
more definitive diagnosis or test for predisposition. The 
disease-associated allele would again have to be identified. 

Regarding IL-10, elevated levels of this cytokine are
found in SLE patients. In addition, IL-10 prevents the
spontaneous death of human splenic B cells in vitro, an
effect that is abolished by neutralizing anti-IL-10 antibody.
IL-10 inhibits apoptotic cell death in human T cells starved
of IL-2 and promotes the in vitro survival of T lymphocytes
from patients with infectious mononucleosis that, otherwise,
are destined to die by apoptosis. These findings are of
importance because the continuous administration of IL-10
to lupus-prone (NZB×NZW)F1 mice accelerated, and
neutralizing anti-IL-10 antibody delayed, the onset of
autoimmunity. Furthermore, the protective effect of IL-10 against B
cell death is associated with an increased expression of
bcl-2. Our data on the synergistic effect of IL-10 and bcl-2
in human SLE provide a genetic basis for these
observations. The data are consistent with the notion that the
maintenance of a high-level anti-apoptotic state in
lymphocytes contributes to the pathogenesis of SLE by sustaining
the rate of production of autoreactive antibodies.

The distal portion of human chromosome 1 (q41-q42) has 
been recently shown to contain an SLE susceptibility gene. 
Although the IL-10 gene resides on the distal portion of 
human chromosome 1, its exact location is proximal to the 
q41 region and, therefore, IL-10, but not the q41-q42 region, 
is closer to the chromosomal interval previously mapped in 
lupus-prone mice. The recent identification of the q41-q42 
susceptibility region on chromosome 1, together with our 
data, identifies a presently unknown, SLE susceptibility 
gene. 

Conclusion 

In summary, given the anti-apoptotic nature of bcl-2 and 
under certain conditions, IL-10, our data further support the 
notion that inappropriate elimination of autoreactive lym- 
phocytes is an important event in the development of SLE 
and other autoimmunities. We demonstrate here for the first 
time that a specific combination of 2 distinct genes that 
regulate apoptosis identifies a human predisposition to an 
autoimmune disease. In addition, we provide a method for 
determining genetic predisposition to systemic lupus 
erythematosus and other autoimmune diseases by genotyping 
IL-10, bcl-2, Fas ligand and other apoptotic genes. 



SEQUENCE LISTING 



<160> NUMBER OF SEQ ID NOS: 8 

<210> SEQ ID NO 1 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to IL-10 microsatellite. 
<400> SEQUENCE: 1 

gcaacactcc tcgtcgcaac 20 



<210> SEQ ID NO 2 
<211> LENGTH: 22 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to IL-10 microsatellite. 

<400> SEQUENCE: 2 

cctcccaaag aagccttagt ag 22 



<210> SEQ ID NO 3 
<211> LENGTH: 25 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 
<223> OTHER INFORMATION: PCR primer to bcl-2 microsatellite. 
<400> SEQUENCE: 3 

cgtgtacaca ctctcataca cggct 25 

<210> SEQ ID NO 4 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to bcl-2 microsatellite. 
<400> SEQUENCE: 4 

gggagggtgc gccatgaaaa 20 

<210> SEQ ID NO 5 
<211> LENGTH: 25 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to Fas-L microsatellite. 
<400> SEQUENCE: 5 

cacttctaaa tgcatatcct gagcc 25 



<210> SEQ ID NO 6 
<211> LENGTH: 28 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to Fas-L microsatellite. 
<400> SEQUENCE: 6 

tgtcaggaag cattcaaaat cttgacca 28 



<210> SEQ ID NO 7 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to CTLA-4 microsatellite. 

<300> PUBLICATION INFORMATION: 

<301> AUTHORS: Polymeropoulos, et al. 

<303> JOURNAL: Nucleic Acids Research 

<304> VOLUME: 19 
<305> ISSUE: 1991 
<306> PAGES: 4018 

<400> SEQUENCE: 7 

gccagtgatg ctaaaggttg 20 



<210> SEQ ID NO 8 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer to CTLA-4 microsatellite. 

<300> PUBLICATION INFORMATION: 

<301> AUTHORS: Polymeropoulos, et al. 

<303> JOURNAL: Nucleic Acids Research 

<304> VOLUME: 19 

<305> ISSUE: 1991 

<306> PAGES: 4018 

<400> SEQUENCE: 8 

aacatacgtg gctctatgca 20 
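
As a quick consistency check on the listing above, the following Python sketch compares each primer's actual length with its declared <211> LENGTH value. The sequences and lengths are transcribed from SEQ ID NOS 1-8; the dictionary layout and function names are illustrative assumptions, not part of the listing format.

```python
from typing import Dict, Tuple

# (sequence as printed, declared <211> length), keyed by SEQ ID NO.
SEQ_LISTING: Dict[int, Tuple[str, int]] = {
    1: ("gcaacactcc tcgtcgcaac", 20),            # IL-10 microsatellite
    2: ("cctcccaaag aagccttagt ag", 22),         # IL-10 microsatellite
    3: ("cgtgtacaca ctctcataca cggct", 25),      # bcl-2 microsatellite
    4: ("gggagggtgc gccatgaaaa", 20),            # bcl-2 microsatellite
    5: ("cacttctaaa tgcatatcct gagcc", 25),      # Fas-L microsatellite
    6: ("tgtcaggaag cattcaaaat cttgacca", 28),   # Fas-L microsatellite
    7: ("gccagtgatg ctaaaggttg", 20),            # CTLA-4 microsatellite
    8: ("aacatacgtg gctctatgca", 20),            # CTLA-4 microsatellite
}


def normalize(printed: str) -> str:
    """Remove the 10-base grouping spaces used in sequence listings."""
    return "".join(printed.split()).lower()


def listing_is_consistent(listing: Dict[int, Tuple[str, int]]) -> Dict[int, bool]:
    """For each entry, check length against <211> and that only a/c/g/t occur."""
    result = {}
    for seq_id, (printed, declared_length) in listing.items():
        seq = normalize(printed)
        result[seq_id] = len(seq) == declared_length and set(seq) <= set("acgt")
    return result


for seq_id, ok in listing_is_consistent(SEQ_LISTING).items():
    print(f"SEQ ID NO {seq_id}: {'consistent' if ok else 'MISMATCH'}")
```

A check of this kind is useful mainly for catching transcription errors when a listing is re-keyed or recovered from a scan.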
What is claimed is: 

1. A method for determining predisposition to an autoim- 
mune disease in a patient, comprising: 

a) obtaining a sample containing genetic material from 
said patient; and 

b) determining whether alleles associated with suscepti- 
bility to said autoimmune disease are present in both 
IL-10 and bcl-2 loci in said sample, wherein the pres- 
ence of both said alleles indicates that said patient has 
a predisposition to said autoimmune disease. 

2. The method of claim 1, wherein the determining step 
comprises amplification of said genetic material. 

3. The method of claim 2, wherein the amplification 
makes use of a primer specific for said allele associated with 
susceptibility to said autoimmune disease. 

4. The method of claim 1, wherein the determining step 
comprises hybridization with a probe specific for said allele 
associated with susceptibility to said autoimmune disease. 

5. The method of claim 1, wherein the IL-10 gene is 
amplified with primers comprising the sequences of SEQ ID 
NO:1 and SEQ ID NO:2. 

6. The method of claim 2 wherein a bcl-2 disease-specific 
allele is amplified with primers comprising the sequences of 
SEQ ID NO:3 and SEQ ID NO:4. 

7. The method of claim 1 wherein the allele associated 
with susceptibility to said autoimmune disease is identified 
by its size. 

8. The method of claim 1 wherein the alleles associated 
with susceptibility to said autoimmune disease are IL-10 
(127) and bcl-2(193) and wherein presence of both alleles 
indicates a greater likelihood of predisposition to said 
autoimmune disease than presence of either allele alone. 

9. The method of claim 1 wherein the autoimmune disease 
is selected from the group consisting of: systemic lupus 
erythematosus, thyroid autoimmunity syndromes, insulin 
dependent diabetes mellitus, inflammatory bowel disease, 
rheumatoid arthritis and other arthritides. 

10. The method of claim 9 wherein the disease is systemic 
lupus erythematosus. 

* * * * * 
I, J. Christopher Grimaldi, declare and say as follows: 

1. I am a Senior Research Associate in the Molecular Biology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. I joined Genentech in January of 1999. From 1999 to 2003, I directed the Cloning 
Laboratory in the Molecular Biology Department. During this time I directed or performed 
numerous molecular biology techniques including qualitative Polymerase Chain Reaction (PCR) 
analyses. I am currently involved in, among other projects, the isolation of genes coding for 
membrane associated proteins which can be used as targets for antibody therapeutics against 
cancer. In connection with the above-identified patent application, I personally performed or 
directed the semi-quantitative PCR analyses in the assay entitled "Tumor Versus Normal 
Differential Tissue Expression Distribution" which is described in EXAMPLE 18 in the 
specification that were used to identify differences in gene expression between tumor tissue and 
their normal counterparts. 



Dear Sir: 



3. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 



4. In differential gene expression studies, one looks for genes whose expression levels 
differ significantly under different conditions, for example, in normal versus diseased tissue. 
Chromosomal aberrations, such as gene amplification, and chromosomal translocations are 
important markers of specific types of cancer and lead to the aberrant expression of specific 
genes and their encoded polypeptides, including over-expression and under-expression. For 
example, gene amplification is a process in which specific regions of a chromosome are 
duplicated, thus creating multiple copies of certain genes that normally exist as a single copy. 
Gene under-expression can occur when a gene is not transcribed into mRNA. In addition, 
chromosomal translocations occur when two different chromosomes break and are rejoined to 
each other chromosome resulting in a chimeric chromosome which displays a different expression 
pattern relative to the parent chromosomes. Amplification of certain genes such as Her2/Neu 
[Singleton et al., Pathol. Annu., 27 Pt 1:165-190], or chromosomal translocations such as t(5;14), 
[Grimaldi et al., Blood, 73(8):2081-2085 (1989); Meeker et al., Blood, 76(2):285-289 (1990)] give 
cancer cells a growth or survival advantage relative to normal cells, and might also provide a 
mechanism of tumor cell resistance to chemotherapy or radiotherapy. When the chromosomal 
aberration results in the aberrant expression of a mRNA and the corresponding gene product (the 
polypeptide), as it does in the aforementioned cases, the gene product is a promising target for 
cancer therapy, for example, by the therapeutic antibody approach. 

5. Comparison of gene expression levels in normal versus diseased tissue has 
important implications both diagnostically and therapeutically. For example, those who work in 
this field are well aware that in the vast majority of cases, when a gene is over-expressed, as 
evidenced by an increased production of mRNA, the gene product or polypeptide will also be 
over-expressed. It is unlikely that one identifies increased mRNA expression without associated 
increased protein expression. This same principle applies to gene under-expression. When a 
gene is under-expressed, the gene product is also likely to be under-expressed. Stated in another 
way, two cell samples which have differing mRNA concentrations for a specific gene are 
expected to have correspondingly different concentration of protein for that gene. Techniques 
used to detect mRNA, such as Northern Blotting, Differential Display, in situ hybridization, 
quantitative PCR, Taqman, and more recently Microarray technology all rely on the dogma that a 
change in mRNA will represent a similar change in protein. If this dogma did not hold true then 
these techniques would have little value and not be so widely used. The use of mRNA 
quantitation techniques have identified a seemingly endless number of genes which are 
differentially expressed in various tissues and these genes have subsequently been shown to have 
correspondingly similar changes in their protein levels. Thus, the detection of increased mRNA 
expression is expected to result in increased polypeptide expression, and the detection of 
decreased mRNA expression is expected to result in decreased polypeptide expression. The 
detection of increased or decreased polypeptide expression can be used for cancer diagnosis and 
treatment. 

6. However, even in the rare case where the protein expression does not correlate 
with the mRNA expression, this still provides significant information useful for cancer diagnosis 
and treatment. For example, if over- or under-expression of a gene product does not correlate 
with over- or under-expression of mRNA in certain tumor types but does so in others, then 
identification of both gene expression and protein expression enables more accurate tumor 
classification and hence better determination of suitable therapy. In addition, absence of over- or 
under-expression of the gene product in the presence of a particular over- or under-expression of 
mRNA is crucial information for the practicing clinician. For example, if a gene is over-expressed 
but the corresponding gene product is not significantly over-expressed, the clinician accordingly 
will decide not to treat a patient with agents that target that gene product. 

7. I hereby declare that all statements made herein of my own knowledge are true and 
that all statements made on information or belief are believed to be true, and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that such willful statements may jeopardize the validity of the application or any 
patent issued thereon. 




By: Date: 

Christopher Grimaldi 
J. Christopher Grimaldi 

1434-36th Ave. 

San Francisco, CA 94122 

(415) 681-1639 (Home) 



EDUCATION 



University of California, Berkeley 
Bachelor of Arts in Molecular Biology, 1984 



EMPLOYMENT EXPERIENCE 



SRA Genentech Inc., South San Francisco; 1/99 to present 

Previously, was responsible to direct and manage the Cloning Lab. Currently focused on 
isolating cancer specific genes for the Tumor Antigen (TAP), and Secreted Tumor Protein 
(STOP) projects for the Oncology Department as well as Immunologically relevant genes for the 
Immunology Department. Directed a lab of 6 scientists focused on a company-wide team effort 
to identify and isolate secreted proteins for potential therapeutic use (SPDI). For the SPDI project 
my duties were, among other things, the critically important coordination of the cloning of 
thousands of putative genes, by developing a smooth process of communication between the 
Bioinformatics, Cloning, Sequencing, and Legal teams. Collaborated with several groups to 
discover novel genes through the Curagen project, a unique differential display methodology. 
Interacted extensively with the Legal team providing essential data needed for filing patents on 
novel genes discovered through the SPDI, TAP and Curagen projects. My group has developed, 
implemented and patented high throughput cloning methodologies that have proven to be 
essential for the isolation of hundreds of novel genes for the SPDI, TAP and Curagen projects as 
well as dozens of other smaller projects. 



Scientist DNAX Research Institute, Palo Alto; 9/91 to 1/99 

Involved in multiple projects aimed at understanding novel genes discovered through 
bioinformatics studies and functional assays. Developed and patented a method for the specific 
depletion of eosinophils in vivo using monoclonal antibodies. Developed and implemented 
essential technical methodologies and provided strategic direction in the areas of expression, 
cloning, protein purification, general molecular biology, and monoclonal antibody production. 
Trained and supervised numerous technical staff. 



Facilities Manager Corixa, Redwood City; 5/89 - 7/91. 

Directed plant-related activities, which included expansion planning, maintenance, safety, 
purchasing, inventory control, shipping and receiving, and laboratory management. Designed 
and implemented the safety program. Also served as liaison to regulatory agencies at the local, 
state and federal level. Was in charge of property leases, leasehold improvements, etc. 
Negotiated vendor contracts and directed the purchasing department. Trained and supervised 
personnel to carry out the above-mentioned duties. 



SRA University of California, San Francisco 
Cancer Research Institute; 2/87-4/89. 

Was responsible for numerous cloning projects including: studies of somatic hypermutation, 
studies of AIDS-associated lymphomas, and cloning of t(5;14), t(11;14), and t(8;14) 
translocations. Focused on the activation of hemopoietic growth factors involved in the t(5;14) 
translocation in leukemia patients. 



Research Technician Berlex Biosciences, South San Francisco; 7/85-2/87. 

Worked on a subunit porcine vaccine directed against Mycoplasma hyopneumoniae. Was 
responsible for generating genomic libraries, screening with degenerate oligonucleotides, and 
characterizing and expressing clones in E. coli. Also constructed a general purpose expression 
vector for use by other scientific teams. 
The t(5;14) Chromosomal Translocation in a Case of Acute Lymphocytic 
Leukemia Joins the Interleukin-3 Gene to the Immunoglobulin Heavy Chain Gene 

By J. Christopher Grimaldi and Timothy C. Meeker 



Chromosomal translocations have proven to be important 
markers of the genetic abnormalities central to the patho- 
genesis of cancer. By cloning chromosomal breakpoints 
one can identify activated proto-oncogenes. We have stud- 
ied a case of B-lineage acute lymphocytic leukemia (ALL) 
that was associated with peripheral blood eosinophilia. The 
chromosomal translocation t(5;14)(q31;q32) from this 
sample was cloned and studied at the molecular level. This 
translocation joined the immunoglobulin heavy chain join- 
ing (Jh) region to the promoter region of the interleukin-3 
(IL-3) gene in opposite transcriptional orientations. The 
data suggest that activation of the IL-3 gene by the 
enhancer of the immunoglobulin heavy chain gene may play 
a central role in the pathogenesis of this leukemia and the 
associated eosinophilia. 
© 1989 by Grune & Stratton, Inc. 

KARYOTYPIC STUDIES of leukemia and lymphoma 
have identified frequent nonrandom chromosomal 
translocations. Some of these translocations juxtapose the 
immunoglobulin heavy chain (IgH) gene with important 
proto-oncogenes, such as c-myc and [...]. In this way, the 
IgH gene can activate proto-oncogenes, resulting in disor- 
dered gene expression and a step in the development of 
cancer. The investigation of additional nonrandom transloca- 
tions into the IgH locus allows us to identify new genes 
promoting the generation of leukemia and lymphoma. 

Fig 1. DNA blots of the leukemia sample. The restriction 
fragment pattern of normal human DNA (N) and the leukemia 
sample (L) were compared using a human Jh probe. Rearranged 
bands are indicated by arrows. Sample L exhibits a single rear- 
ranged band with both HindIII/EcoRI and Sau3A restriction 
digests. The rearranged bands are less intense than the other 
bands because the majority of cells in the sample represent normal 
bone marrow elements. 

A distinct subtype of acute lymphocytic leukemia (ALL) 
has been characterized by B-lineage phenotype, associated 
eosinophilia in the peripheral blood, and a t(5;14)(q31;q32) 
chromosomal translocation. This syndrome probably 
occurs in <1% of all patients with ALL. We hypothesized 
that the cloning of the translocation characteristic of this 
leukemia might allow the identification of an important gene 
on chromosome 5 that plays a role in the evolution of this 
disease. In this report we demonstrate that the interleukin-3 
gene (IL-3) and the IgH gene are joined by this transloca- 
tion. 

MATERIALS AND METHODS 

Sample and DNA blots. A bone marrow aspirate from a repre- 
sentative patient with ALL (L1 morphology by French-American- 
British [FAB] criteria), peripheral eosinophilia (up to 20,000 per 
microliter with a normal value of <350 per microliter) and a 
t(5;14)(q31;q32) translocation was studied. Using published meth- 
ods, genomic DNA was isolated and DNA blots were made. Briefly, 
10 µg of high molecular weight (mol wt) DNA were digested using 
an appropriate restriction enzyme and electrophoresed on a 0.8% 
agarose gel. The gel was stained with ethidium bromide, photo- 
graphed, denatured, neutralized, and transferred to Hybond (Amer- 
sham, Arlington Heights, IL). After treatment of the filter with 
ultraviolet light, hybridization was performed. The filter was washed 
to a final stringency of 0.2% saturated sodium citrate (SSC) and 
0.1% sodium lauryl sulfate (SDS) and exposed to film. The human 
Jh probe has been previously reported. 

Genomic library. The genomic library was made using pub- 
lished methods. Approximately 100 µg of high mol wt genomic 
DNA were partially digested with the Sau3A restriction enzyme. 
Fragments from 9 to 23 kilobases (kb) in size were isolated on a 
sucrose gradient and ligated into phage EMBL3A (Stratagene, San 
Diego). Recombinant phage were packaged, plated, and screened as 
previously reported. 

DNA sequencing. Fragments for sequencing were cloned into 
M13 vectors and sequenced by the chain termination method using 
Sequenase (United States Biochemical, Cleveland). All sequence 
data were derived from both strands. 

RESULTS 

We studied a bone marrow sample from a patient with 
ALL and associated peripheral eosinophilia. Karyotypic 
analysis showed the characteristic t(5;14)(q31;q32) translo- 
cation. These features define a distinctive subtype of ALL. 
The leukemic cells were analyzed for cell surface phenotype 
by immunofluorescence. They were positive for B1 (CD20), 
B4 (CD19), CALLA (CD10), HLA-DR, and terminal 
deoxynucleotidyl transferase (Tdt), but negative for surface 
immunoglobulin. This phenotypic profile describes an imma- 
ture cell from the B-lymphocytic lineage. 

The leukemia DNA was analyzed by Southern blotting for 
rearrangements of the IgH gene. Using a human immuno- 
globulin Jh probe, a single rearranged band was detected by 
EcoRI, HindIII, SstI, Sau3A, and EcoRI plus HindIII 
restriction digests, suggesting rearrangement of one allele 
(Fig 1). The immunoglobulin Jh region from the other allele 
was presumably either deleted or in the germline configura- 
tion. 

We hypothesized that the t(5;14)(q31;q32) juxtaposed a 
growth-promoting gene on chromosome 5 with the immuno- 
globulin Jh region on chromosome 14. Therefore, a genomic 
library was made from the leukemic sample and screened 
with a Jh probe. Fifteen distinct positive clones were isolated 
and screened for the presence of the rearranged Sau3A 
fragment that was detected by DNA blotting. By this 
analysis, five clones appeared to represent the rearranged 
allele identified by DNA blots. One of these clones (clone no. 
4) was chosen for further study and a detailed restriction 
map was generated. The EcoRI, HindIII/EcoRI and SstI 
fragments from clone no. 4 that hybridized to the human Jh 
probe were also identical in size to the rearranged fragments 
from the leukemia sample, confirming that clone no. 4 
represented the rearranged leukemic allele. 

Phage clone no. 4 contained 3.7 kb of unknown origin 
joined to the IgH gene in the region of Jh4 (Fig 2). The IgH 
gene from Jh4 to the Cmu region appeared to be in germline 
configuration. Previously, the gene encoding hematopoietic 
growth factor IL-3 had been mapped to chromosome 5q31 so 
it was suspected that clone no. 4 might contain part of this 
gene. When the restriction map of human IL-3 and clone 
no. 4 were compared, they were identical for more than 3 kb 
(Fig 2). 

We confirmed the juxtaposition of the IL-3 gene and the 
IgH gene by nucleic acid sequencing of the subcloned 
BstEII/HpaI fragment (Fig 2). The sequence of this frag- 
ment showed no disruption of the protein coding region or the 
messenger RNA of the IL-3 gene. The break in the IL-3 gene 
occurred in the promoter region, 452 base pairs (bp) 
upstream of the transcriptional start site (position 64, Fig 
3A). The break in the IgH gene occurred 2 bp upstream of 
the Jh4 region. Between the two breaks, 25 bp of uncertain 
origin (putative N sequence) were inserted. No sequences 
homologous to the immunoglobulin heptamer and nonamer 
could be identified in the IL-3 sequence (Fig 3B). Therefore, 
nucleic acid sequencing confirmed the juxtaposition of the 
IL-3 gene and the IgH gene. The sequence data clearly 
showed that the genes were positioned in opposite transcrip- 
tional orientations (head-to-head). 

Available data also allowed us to determine the normal 
positions of the IL-3 gene and the GM-CSF gene in relation 
to the centromere of chromosome 5 (Fig 4). The IgH gene is 
known to be positioned with the variable regions toward the 
telomere on chromosome 14q. It has also been shown that 
GM-CSF maps within 9 kb of IL-3 in the same transcrip- 
tional orientation. Using this information and assuming a 
simple translocation event in our sample, we can conclude 
that the IL-3 gene is normally more centromeric, and the 
GM-CSF gene more telomeric on chromosome 5q (Fig 4). 
Furthermore, both are transcribed with their 5' ends toward 
the centromere. 

DISCUSSION 

In this report we have cloned a unique chromosomal 
translocation that appears to be a consistent feature of a rare, 
yet distinct, clinical form of acute leukemia. This transloca- 
tion joined the promoter of the IL-3 gene to the IgH gene. 
Except for the altered promoter, the IL-3 gene appeared 

[Fig 3: nucleotide sequence panels (A) and (B); the sequence text is not recoverable from the scan.] 

Fig 3. Sequence of t(5;14)(q31;q32) breakpoint region. (A) Nucleotide sequence of the BstEII/HpaI fragment indicated on Fig 2. 
Nucleotides [...] represent the Jh4 coding region, underlined on the coding strand. Nucleotides 39 to 63 are a putative N region. The 
sequence from position 64 to [...] is that of the germline IL-3 gene. The IL-3 TATA box, transcription start, and translation initiation 
methionine are underlined. Two proposed regulatory sequences in the promoter are marked by asterisks (positions [...] and 389). (B) 
Sequence of the breakpoint region [...] underlined. Clone no. 4 is shown with putative N region sequence. A "+" 
denotes the identical nucleotide between sequences. No heptamer or nonamer is identified in the IL-3 sequence. 
Fig 4. Diagram of the translocation. The normal chromosome 
5q31 is shown with the GM-CSF gene telomeric to the IL-3 gene in 
the transcriptional orientation shown. On normal chromosome 
14q32 the Vh regions are telomeric. The t(5;14)(q31;q32) translo- 
cation results in the head-to-head orientation of these genes. 
Symbols are defined in Fig 2. BP, breakpoint position. 



intact as no deletions, insertions, or point mutations were 
detected by restriction mapping of the entire gene and 
sequencing of part of the gene. The IgH gene has been 
truncated at the Jh4 region, which places the immunoglobu- 
lin enhancer within 2.5 kb of the IL-3 gene. This leads to 
the hypothesis that the enhancer is increasing transcription 
of a structurally normal IL-3 gene. The same mechanism is 
important for activation of the c-myc gene in some cases of 
Burkitt's lymphoma. An alternate hypothesis is that the 
elimination of an upstream IL-3 promoter element is crucial 
to the activation of the IL-3 gene. 

The proposed activation of the IL-3 gene suggests that an 
autocrine loop is important for the pathogenesis of this 
leukemia. Over-expression of the IL-3 gene coupled with 
the presence of the IL-3 receptor in these cells could account 
for a strong stimulus for proliferation. In this regard, there 
are data indicating that immature B-lineage lymphocytes 
and B-lineage leukemias may express the IL-3 receptor. 

An additional feature of this type of leukemia is the 
dramatic eosinophilia, consisting of mature forms. It has 
been hypothesized that the eosinophils do not arise from the 
malignant clone, but are stimulated by the tumor. 
Because of the known effect of IL-3 on eosinophil differentia- 
tion, secretion of high levels of IL-3 by leukemic cells might 
have a role in the eosinophilia in this type of leukemia. 

The data suggest that the recombination mechanism that 
is active in the IgH gene during normal differentiation has a 
role in this translocation. This is supported by the break- 
point location at the 5' end of Jh4 and the presence of 
putative N-region sequences. On the other hand, no recombi- 
nation signal sequence (heptamer and nonamer) was found 
in this region on chromosome 5, suggesting that additional 
factors also played a role. Further studies will elucidate the 
mechanism of this and other translocations. 

In the leukemia we studied, it is possible that the immuno- 
globulin enhancer also activates the GM-CSF gene, since 
this gene is probably positioned only 14 kb away (Fig 4). This 
is known to be within the range of enhancer activation. The 
interleukin-5 (IL-5) gene maps to chromosome 5q31. 
Deregulation of the IL-5 gene by this translocation would act 
synergistically with IL-3 in the stimulation of eosinophil 
proliferation and differentiation. These and other questions 
will be answered by the study of more patient samples. We 
plan to determine whether the t(5;14)(q31;q32) transloca- 
tion is capable of activating multiple lymphokines simulta- 
neously and whether they cooperate in the generation of this 
leukemia. 
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The t(5;1«Hq31;q32) translocation from B-lineage Qcute 
iyimphocyttc teuEcemSa with eoslnophitia has been cloned 
iTrotn two leukemia samples. In both cases, this l^nsloca- 
^ion Joined the I9H gene and the lnterleukin-3 CtL-3) gene. In 
oine patient, excess IL^ mRNA was produced by the 
DtDutcemic cells. On the second patient, serum IL-3 levels 
were measured and shown to correlate with disease 

ANUMBBR OF chromosome translocations have been 
associated with human leukemia and lymphoma. In 
many cases the study of these translocations has led to the 
discovery or cliaracterization of proto-oncogenes, such as 
bcl'2, Crobl, and c-myc^ that are located ikdjaceat to the 
transIocationJ-^ It is now widely understood that cancer- 
associated translocations disrupt nearby proto-oncogenes. 

A distinct subtype of acute Eeufceniia is charactenzed by 
the triad of B-lineage immunophcnotype, eosinophilia, and 
the t(5;I4)(q31;q32) translocation.^'* Leukemic cells from 
such patients have been positive for terminal deoxynucleotidyl 
transferase (Tdt), common acute lymphoblastic leukemia 
antigen (CALLA), and CD19, but negative for surface or 
cytoplasmic inmiunoglobulia. In prenous work, we cloned 
the t(5;14) breakpoint from one leukemic sample (Case 1) 
and determined that the IgH and interleukin-3 (11^3) genes 
were joined by this abnormaHty.^ In this report, we extend 
those findings by showing that the t(5;I4)(q31;q32) translo- 
cation &om a second leukemia sample (C^e 2) has a similar 
structure, and we report our study of growth factor expres- 
sion in these patients. 

MATERIALS AND METHODS 

Samples and Southern blots. Case 1 has been described.^'* 
Clinical features of Case 2 have been described m detail.' DNA 
isolation and Southern blotting was done using previously dcs^bed 
methods.' Filters were hybridized with an immunoglobttlin Jh probe, 
a 280 bp BamHllEcoVtl genomic IL-3 fragment, and an IL-3 
cDNApiobe." 

Northern blots. RNA isolation and Northern blotting have been 
described.' Briefly, Northern blots were done by separating 9^8 
total RNA on 1% agarose-formaldehyde gels. Equal RNA loading in 
each lane was confinned by ethidium bromide staining. Blots were 
hybridized with an lL-3 cDNA probe extending to the Xho I site in 
cxon 5» a 720 bp Sst l/Kpn I probe derived from inlron 2 of the IL-3 
gene, a 600 bp Nhe l/Hpa 1 IL-5 cDNA probe, and a 500 bp Pst 
l/Nco I granulocyte-macrophage oolony stimulating factor (GM* 
CSF)cDNA probe. 

activity. There was no evidence of excess granulocyte/
macrophage colony stimulating factor (GM-CSF) or IL-5
expression. Our data support the formulation that this
subtype of leukemia may arise in part because of a
chromosome translocation that activates the IL-3 gene,
resulting in autocrine and paracrine growth effects.
© 1990 by The American Society of Hematology.

Polymerase chain reaction. Primers were designed with BamHI
sites for cloning. One primer hybridized to the Jh sequences from the
IgH gene (Primer 144: 5'-TAGGATCCGACGGTGACCAGGGT),
and the other hybridized to the region of the TATA box in the IL-3
gene (Primer 161: 5'-AACAOGATCCCGCCTTATATGTOCAG).
Polymerase chain reaction (PCR) (95°C for 1 minute, 61°C for 30
seconds, and 72°C for 3 minutes) was done using 500 ng genomic
DNA and 50 pmol of each primer in 100 µL containing 67 mmol/L
Tris-HCl pH 8.8, 6.7 mmol/L MgCl2, 10% dimethyl sulfoxide
(DMSO), 170 µg/mL bovine serum albumin (BSA) (fraction V),
16.6 mmol/L ammonium sulfate, 1.5 mmol/L each dNTP, and Taq
polymerase (Perkin-Elmer, Norwalk, CT).
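
The primer design described above can be checked with a short Python sketch. It confirms that Primer 144 carries a BamHI recognition site (GGATCC) and computes the primer concentration implied by 50 pmol per reaction. The function names are illustrative and do not come from the original report, and the calculation assumes the reaction volume was 100 microliters.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
PRIMER_144 = "TAGGATCCGACGGTGACCAGGGT"  # Jh primer as printed in the text


def find_site(primer: str, site: str = BAMHI_SITE) -> int:
    """Return the 0-based index of the restriction site, or -1 if absent."""
    return primer.upper().find(site)


def primer_molarity_umol_per_l(pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in microliters to micromol/L.

    1 pmol per microliter equals 1 micromol per liter.
    """
    return pmol / volume_ul


if __name__ == "__main__":
    print("BamHI site index in Primer 144:", find_site(PRIMER_144))
    print("Primer concentration (umol/L):", primer_molarity_umol_per_l(50, 100))
```

The site sits two bases from the 5' end of the primer, leaving a short overhang of extra bases upstream of the cloning site.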

Sequencing. Sequencing was done by chain termination in M13
vectors. As part of this study, we sequenced a subclone of a normal
IL-3 promoter, covering 598 base pairs from a Sma I site at position
-1240 (with respect to the proposed site of transcription initiation)
to an Nhe I site at position [...]. The plasmid containing this region
was a gift from Naoko Arai of the DNAX Research Institute.

Expression in Cos7 cells. A genomic IL-3 fragment from Case 1
was cloned into the pXM expression vector. Briefly, the HindIII/
Sal I fragment containing the IL-3 gene was subcloned from the
previously described phage clone 4 into pUC18. The 2.6 kb
fragment extending from the Sma I site 61 bp upstream of the IL-3
transcription start to the Sma I site in the polylinker was cloned into
the blunted Xho I site of pXM. The negative control construct was
the pXM vector without insert. Plasmids were introduced into Cos7
cells by electroporation, and supernatant was collected after 48
hours in culture.

Fig 1. Breakpoint sequences for Case 2. The germline IgJh5 region
sequence (protein coding region and [...] sequences are underlined)
is on top, the translocation sequence from Case 2 (PCR [...]) is in
the middle, and the germline IL-3 sequence, which we derived from a
normal IL-3 clone, [...] sequence has the same nucleotide. The
sequence documents the head-to-head joining of the IL-3 and IgH
genes. The breakpoint in the IL-3 gene occurred at position -934 (♦).

TF-1 bioassay. TF-1 cells were passaged in RPMI 1640
supplemented with 10% heat-inactivated fetal bovine serum, 2 mmol
L-glutamine, and 1 ng/mL human GM-CSF. Samples and antibodies
were diluted in this same medium lacking GM-CSF but containing
penicillin and streptomycin. A 25 µL volume of serial dilutions of
patient serum was added to wells in a flat bottom 96-well microtiter
plate. Rat anti-cytokine monoclonal antibody in a volume of 25 µL
was added to appropriate wells and preincubated for 1 hour at 37°C.
Fifty microliters of twice washed TF-1 cells were added to each well,
giving a final cell concentration of 1 x 10^4 cells per well (final
volume, 100 µL). The plate was incubated for 48 hours. The
remaining cell viability was determined metabolically by the
colorimetric method of Mosmann using a VMax microtiter plate reader
(Molecular Devices, Menlo Park, CA) set at 570 and 650 nm.
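
The well composition in the bioassay can be tallied with a small Python sketch. It sums the three additions named in the protocol (serum dilution, antibody, and cells) and reports how much each is diluted in the final well. The function and dictionary keys are illustrative names, not part of the original method.

```python
def final_well(volumes_ul: dict) -> tuple:
    """Return (total volume in microliters, fold dilution of each component)."""
    total = sum(volumes_ul.values())
    dilution = {name: total / vol for name, vol in volumes_ul.items()}
    return total, dilution


if __name__ == "__main__":
    # Volumes as stated in the protocol, in microliters.
    well = {"serum": 25, "antibody": 25, "cells": 50}
    total, dilution = final_well(well)
    print("Final volume (uL):", total)
    for name, fold in dilution.items():
        print(f"{name}: diluted {fold:g}-fold in the well")
```

The total agrees with the stated final volume, and each serum dilution is diluted a further fourfold once antibody and cells are added.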

Cytokine immunoassays. These assays used rat monoclonal
anti-cytokine antibodies (10 µg/mL) to coat the wells of a PVC
microtiter plate. The capture antibodies used were BVD3-6G8,
JES1-39D10, and BVD2-23B6, for the IL-3, IL-5, and GM-CSF
assays, respectively. Patient sera were then added (undiluted and
diluted 1:2 for IL-3, undiluted for IL-5, and undiluted and diluted
1:5 for GM-CSF). The detecting immunoreagents used were either
mouse antiserum to IL-3 or nitroiodophenyl (NIP)-derivatized rat
monoclonal antibodies JES1-5A2 and BVD2-21C11, specific for
IL-5 and GM-CSF, respectively. Bound antibody was subsequently
detected with immunoperoxidase conjugates: horseradish peroxidase
(HRP)-labeled goat anti-mouse Ig for IL-3, or HRP-labeled rat (J4
MoAb) anti-NIP for IL-5 and GM-CSF. The chromogenic
substrate was 3-3'azino-bis-benzthiazoline sulfonate (ABTS; Sigma, St
Louis, MO). Unknown values were interpolated from standard
curves prepared from dilutions of the recombinant factors using
Softmax software available with the VMAX microplate reader
(Molecular Devices).
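
Reading an unknown off a standard curve can be illustrated with a minimal Python sketch. It uses piecewise-linear interpolation between bracketing standards, which is an assumption: the curve-fitting model used by the Softmax software is not stated in the text. The standards in the usage lines are made-up numbers chosen only to show the calculation.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Interpolate a concentration from (absorbance, concentration) standards.

    Returns None when the reading falls outside the standard curve, so the
    caller can report it as below or above the measurable range.
    """
    pts = sorted(standards)
    xs = [a for a, _ in pts]
    if not xs[0] <= absorbance <= xs[-1]:
        return None
    i = bisect_left(xs, absorbance)
    if xs[i] == absorbance:
        return pts[i][1]
    (x0, y0), (x1, y1) = pts[i - 1], pts[i]
    return y0 + (y1 - y0) * (absorbance - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical standards: (absorbance, pg/mL).
    hypothetical = [(0.25, 500.0), (0.75, 2000.0), (1.25, 8000.0)]
    print(interpolate_concentration(0.5, hypothetical))
    print(interpolate_concentration(0.1, hypothetical))
```

A reading below the lowest standard returns None, which corresponds to the "less than" entries used when a factor could not be detected.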

RESULTS 

Leukemic DNA from Case 2 was studied by Southern
blotting. When digested with the HindIII restriction enzyme
and hybridized with a human immunoglobulin heavy chain
joining region (Jh) probe, a rearranged fragment at
approximately 14 kb was detected (data not shown). When reprobed
with either of two different IL-3 probes, a rearranged 14 kb
fragment, comigrating with the rearranged Jh fragment, was
identified. When leukemic DNA was digested with HindIII
plus EcoRI, a rearranged Jh fragment was detected at 6 kb.
The IL-3 probes also identified a comigrating fragment of
this size. These experiments indicated that the leukemic
sample studied was clonal and that a single fragment
contained both Jh and IL-3 sequences, suggesting a
translocation had occurred.

To characterize better the joining of the IL-3 gene and the
immunoglobulin heavy chain (IgH) gene, the polymerase
chain reaction (PCR) was used to clone the translocation.
A Jh primer and an IL-3 primer were designed to produce an
amplified product in the event of a head-to-head
translocation. While control DNA gave no PCR product, Case 2 DNA
yielded a PCR-derived fragment of approximately 980 bp
which was cloned and sequenced.

Fig 2. Relationship of chromosome 5 breakpoints to the IL-3 gene. This
figure shows the two cloned breakpoints [...] the normal IL-3 gene. One
breakpoint occurred at position -492 and the other at -934 (arrows). Both
translocations resulted in a head-to-head joining of the IgH gene and the
IL-3 gene, leaving the mRNA and protein coding [...] gene intact. Boxes
denote the five IL-3 exons; restriction enzymes are (B) BamHI, (P) [...],
(H) [...], (E) EcoRI, and (X) Xho I.

Fig 3. Documentation of IL-3 mRNA over-expression. A Northern blot was
prepared and hybridized with a probe for IL-3. Lane 1 contained RNA from
unstimulated peripheral blood lymphocytes (PBL) as a negative control.
Lane 2 contained RNA from PBL stimulated for 4 hours with concanavalin A
(ConA), and lane 3 contained RNA from PBL stimulated with ConA for 48
hours. As in the positive control lanes (2 and 3), a 1 kb band was
identified in the leukemic sample from Case 1 (lane 4, lower arrow) [...]
IL-3 gene. In addition, the leukemic sample showed over-expression of an
unspliced 2.9 kb IL-3 transcript (lane 4, upper arrow). We documented that
this represented an unspliced precursor of the mature 1 kb transcript by
showing that this band hybridized to a probe from intron 2 of the IL-3
gene. A similar 2.9 kb band was detected in lane 2, suggesting that an IL-3
mRNA of this size is sometimes detectable in normal mitogen-stimulated
cells. Lanes 5 through 10 represent RNA from six samples of B-lineage acute
lymphocytic leukemia without the t(5;14) translocation, indicating that
only the sample with the translocation exhibited IL-3 over-expression.
Case 2 could not be analyzed by Northern blot because too few cells were
available for study.

The DNA sequence of the translocation clone from Case 2
confirmed the joining of the Jh region with the promoter of
the IL-3 gene in a head-to-head configuration (Fig 1).
Sequence analysis indicated that the breakpoint on
chromosome 14 was just upstream of the Jh5 coding region. The
breakpoint on chromosome 5 occurred 934 bp upstream of
the putative site of transcription initiation of the IL-3 gene.
We also determined that a putative N sequence of 17 bp was
inserted between the chromosome 5 and chromosome 14
sequences during the translocation event. Figure 2 shows
the locations of the two cloned breakpoints in relation to the
IL-3 gene. The two chromosome 5 breakpoints were
separated by less than 500 bp.
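
The spacing of the two breakpoints follows directly from their positions relative to the IL-3 transcription start. A minimal Python sketch of that arithmetic, using the positions given in the Fig 2 legend (-492 and -934), is shown here; the function name is illustrative only.

```python
def breakpoint_separation(pos_a: int, pos_b: int) -> int:
    """Distance in bp between two positions on the same coordinate axis."""
    return abs(pos_a - pos_b)


if __name__ == "__main__":
    # Positions relative to the proposed transcription start of the IL-3 gene.
    separation = breakpoint_separation(-492, -934)
    print("Breakpoint separation (bp):", separation)
    print("Within 500 bp:", separation < 500)
```

Both breakpoints therefore fall within a window of a few hundred base pairs in the IL-3 promoter, upstream of the coding exons.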

The genomic structure in Cases 1 and 2 suggested that a
normal IL-3 gene product was over-expressed as a result of
the altered promoter structure. This would predict that the
IL-3 gene on the translocated chromosome was capable of
making IL-3 protein. This prediction was tested by
expressing a genomic fragment from the translocated allele of Case
1 containing all five IL-3 exons under the control of the SV40
promoter/enhancer in the Cos7 cell line. Cell supernatants
were studied in a proliferation assay using the factor
dependent erythroleukemic cell line, TF-1. The supernatants
derived from transfections using the vector plus insert
supported TF-1 proliferation, while supernatants from
transfections using the vector alone were negative in this assay
(data not shown). Furthermore, the biologic activity could be
blocked by an antibody to human IL-3 (BVD3-6G8). This
result showed that the translocated allele retained the ability
to make IL-3 mRNA and protein.

The level of expression of IL-3 mRNA in leukemic cells 
from Case 1 was assessed. Northern blotting showed that the 
mature IL-3 mRNA (approximately 1 kb) and a 2.9 kb
unspliced IL-3 mRNA were excessively produced by the
leukemia (Fig 3). The 2.9 kb form of the mRNA is also
present at low levels in normal peripheral blood T
lymphocytes after mitogen activation (Fig 3). Several B-lineage
acute leukemia samples without the t(5;14) translocation
had undetectable levels of IL-3 mRNA in these experiments.
In addition, although genes for GM-CSF and IL-5 map close
to the IL-3 gene and might have been deregulated by the
translocation, no IL-5 or GM-CSF mRNA could be detected
in the leukemic sample (data not shown).

Three serum samples from Case 2 were assayed by
immunoassay for levels of IL-3, GM-CSF, and IL-5 (Table
1). Serum IL-3 could be detected and correlated with the
clinical course. When the patient's leukemic cell burden was
highest, the IL-3 level was highest. No serum GM-CSF or
IL-5 could be detected.

Since the IL-3 immunoassay measured only immunoreactive
factor, we confirmed that biologically active IL-3 was
present by using the TF-1 bioassay. This bioassay can be
rendered monospecific using appropriate neutralizing
monoclonal antibodies specific for IL-3, IL-5, or GM-CSF. We
observed that sera from 1-16-84 and 3-14-84 contained TF-1
stimulating activity that could be blocked with anti-IL-3
MoAb (BVD3-6G8), but not with MoAbs to IL-5
(JES1-39D10) or GM-CSF (BVD2-23B6) (Fig 4; GM-CSF data
not shown). The amount of neutralizable bioactivity in these
two samples correlated very well with the difference in IL-3
levels obtained by immunoassay for these samples.
Furthermore, the failure to block TF-1 proliferating activity with
either anti-IL-5 or anti-GM-CSF was consistent with the
inability to measure these factors by immunoassay and
indicated that these other myeloid growth factors were not
detectably circulating in the serum of this patient.

Table 1. Peripheral Blood Counts and Growth Factor Levels
at Different Times in Case 2

                                          Sample Date
                                 11/16/83    1/16/84    3/14/84
Peripheral blood counts (cells/µL)
  WBC                              81,800    116,600     12,300
  Lymphoblasts                          0     33,785          0
  Eosinophils                      46,626     73,080        615
Serum growth factor levels (pg/mL)
  IL-3                               <444       7^95      1,051
  GM-CSF                              <15        <15        <15
  IL-5                                <50        <60        <60

Peripheral blood counts from Case 2 at three different time points with
the corresponding growth factor levels quantified by immunoassay. The
patient received chemotherapy between 1/16/84 and 3/14/84 to lower
his leukemic burden. No serum samples were available for a similar
analysis of Case 1.

Abbreviation: WBC, white blood cells.

Fig 4. Bioassay of serum IL-3. Leukemic patient sera were tested for
bioactive IL-3 and IL-5 in the TF-1 [...] monospecific by using a 1 µg/mL
final concentration of monoclonal rat anti-IL-3 (BVD3-6G8) or [...]

DISCUSSION 

In this report, we have extended our analysis of acute
lymphocytic leukemia and eosinophilia associated with the
t(5;14) translocation. In both cases we have studied, we have
documented the joining of the IL-3 gene from chromosome 5
to the IgH gene from chromosome 14. The breakpoints on
chromosome 5 are within 500 bp of each other, suggesting
that additional breakpoints will be clustered in a small region
of the IL-3 promoter. The PCR assay we have developed will
be useful in the screening of additional clinical samples for
this abnormality.

The finding of a disrupted IL-3 promoter associated with
an otherwise normal IL-3 gene implied that this
translocation might lead to the over-expression of a normal IL-3 gene
product. In this work, we have documented that this is true.
In addition, neither GM-CSF nor IL-5 is over-expressed by
the leukemic cells. Furthermore, in one patient, serum IL-3
could be measured and correlated with disease activity. To
our knowledge, this is the first measurement of human IL-3
in serum and its association with a disease process. The
measurement of serum IL-3 in this and other clinical settings
may now be indicated.



The finding of the IL-3 gene adjacent to a cancer- 
associated translocation breakpoint suggests that its activa- 
tion is important for oncogenesis. It is our thesis that an 
autocrine loop for IL-3 is important for the evolution of this
leukemia. The excessive IL-3 production that we have
documented would be one feature of such an autocrine loop.
The final proof of our thesis must await additional data. In
particular, from the study of additional clinical samples, it
will be necessary to document that the IL-3 receptor is
present on the leukemic cells and that anti-IL-3 antibody
decreases proliferation of the leukemia in vitro.

An important aspect of this work is the suggestion of a 
therapeutic approach for this disease. If an autocrine loop for 
IL-3 can be documented in this disease, attempts to lower
circulating IL-3 levels or block the interaction of IL-3 with
its receptor may prove useful. Because it is also possible that
the eosinophilia in these patients is mediated by the
paracrine effects of leukemia-derived IL-3, similar interventions
may improve this aspect of the disease. Antibodies or
engineered ligands to accomplish these goals may soon be
available.

Clinical and Pathologic Significance of the
c-erbB-2 (HER-2/neu) Oncogene

Timothy P. Singleton and John G. Strickler



The c-erbB-2 oncogene was first shown to have clinical significance in 1987 by
Slamon et al, who reported that c-erbB-2 DNA amplification in breast
carcinomas correlated with decreased survival in patients with metastasis to
axillary lymph nodes. Subsequent studies, however, of c-erbB-2 activation in
breast carcinoma reached conflicting conclusions about its clinical
significance. This oncogene also has been reported to have clinical and
pathologic implications in other neoplasms. Our review summarizes these
various studies and examines the clinical relevance of c-erbB-2 activation,
which has not been emphasized in recent reviews. The molecular biology of the
c-erbB-2 oncogene has been extensively reviewed and will be discussed only
briefly here.



BACKGROUND 

The c-erbB-2 oncogene was discovered in the 1980s by three lines of
investigation. The neu oncogene was detected as a mutated transforming gene in
neuroblastomas induced by ethylnitrosourea treatment of fetal rats. The
c-erbB-2 was a human gene discovered by its homology to the retroviral gene
v-erbB. HER-2 was isolated by screening a human genomic DNA library for
homology with v-erbB. When the DNA sequences were determined subsequently,
c-erbB-2, HER-2, and neu were found to represent the same gene.
Recently, the c-erbB-2 oncogene also has been referred to as NGL.

The c-erbB-2 DNA is located on human chromosome 17q21 and codes
for c-erbB-2 mRNA (4.6 kb), which is translated into c-erbB-2 protein (p185).
This protein is a normal component of cytoplasmic membranes. The c-erbB-2
oncogene is homologous with, but not identical to, c-erbB-1, which is located
on chromosome 7 and codes for the epidermal growth factor receptor. The
c-erbB-2 protein is a receptor on cell membranes and has intracellular tyrosine
kinase activity and an extracellular binding domain. Electron microscopy
with a polyclonal antibody detects c-erbB-2 immunoreactivity on cytoplasmic
membranes of neoplasms, especially on microvilli and the non-villous outer cell
membrane. In normal cells, immunohistochemical reactivity for c-erbB-2 is
frequently present at the basolateral membrane or the cytoplasmic membrane's
brush border.

There is experimental evidence that c-erbB-2 protein may be involved in
the pathogenesis of breast neoplasia. Overproduction of otherwise normal
c-erbB-2 protein can transform a cell line into a malignant phenotype. Also,
when the neu oncogene containing an activating point mutation is placed in
transgenic mice with a strong promoter for increased expression, the mice
develop multiple independent mammary adenocarcinomas. In other
experiments, monoclonal antibodies against the neu protein inhibit the growth (in
nude mice) of a neu-transformed cell line, and immunization of mice with
neu protein protects them from subsequent tumor challenge with the
neu-transformed cell line. Some authors have speculated that the use of
antagonists for the unknown ligand could be useful in future chemotherapy. Further
review of this experimental evidence is beyond the scope of this article.

The c-erbB-2 activation most likely occurs at an early stage of neoplastic
development. This hypothesis is supported by the presence of c-erbB-2
activation in both in situ and invasive breast carcinomas. In addition, studies
of metastatic breast carcinomas usually demonstrate uniform c-erbB-2 activation
at multiple sites in the same patient, although c-erbB-2 activation has
rarely been detected in metastatic lesions but not in the primary tumor.
Even more rarely, c-erbB-2 DNA amplification has been detected in a primary
breast carcinoma but not in its lymph node metastasis. In patients who have
bilateral breast neoplasms, both lesions have similar patterns of c-erbB-2
activation, but only a few such cases have been studied.

MECHANISMS OF ACTIVATION

The most common mechanism of c-erbB-2 activation is genomic DNA
amplification, which almost always results in overproduction of c-erbB-2 mRNA
and protein. The c-erbB-2 amplification may stabilize the overproduction of
mRNA or protein through unknown mechanisms. Human breast carcinomas
with c-erbB-2 amplification contain 2 to 40 times more c-erbB-2 DNA and 4 to
128 times more c-erbB-2 mRNA than found in normal tissue. Most human
breast carcinomas with c-erbB-2 amplification have 2 to 15 times more c-erbB-2
DNA. Tumors with greater amplification tend to have greater overproduction.
The non-mammary neoplasms that have been studied tend to have
similar levels of c-erbB-2 amplification or overproduction relative to the
corresponding normal tissue.

The second most common mechanism of c-erbB-2 activation is
overproduction of c-erbB-2 mRNA and protein without amplification of c-erbB-2
DNA. The quantities of mRNA and protein usually are less than those in
amplified cases and may approach the small quantities present in normal breast
or other tissues. The c-erbB-2 protein overproduction without mRNA
overproduction or DNA amplification has been described in a few human breast
carcinoma cell lines.

Other rare mechanisms of c-erbB-2 activation have been reported.
Translocations involving the c-erbB-2 gene have been described in a few mammary
and gastric carcinomas, although some reported cases may represent restriction
fragment length polymorphisms or incomplete restriction enzyme digestions
that mimic translocations. A single point mutation in the transmembrane
portion of neu has been described in rat neuroblastomas induced by
ethylnitrosourea. The mutated neu protein has increased tyrosine kinase
activity and aggregates at the cell membrane. Although there has been
speculation that some of the amplified c-erbB-2 genes may contain point
mutations, none has been detected in primary human neoplasms.



TECHNIQUES FOR DETECTING c-erbB-2 ACTIVATION
Detection of c-erbB-2 DNA Amplification

Amplification of c-erbB-2 DNA is usually detected by DNA dot blot or Southern
blot hybridization. In the dot blot method, the extracted DNA is placed
directly on a nylon membrane and hybridized with a c-erbB-2 DNA probe. In
the Southern blot method, the extracted DNA is treated with a restriction
enzyme, and the fragments are separated by electrophoresis, transferred to a
nylon membrane, and hybridized with a c-erbB-2 DNA probe. In both
techniques, c-erbB-2 amplification is quantified by comparing the intensity
(measured by densitometry) of the hybridization bands from the sample with
those from control tissue.

Several technical problems may complicate the measurement of c-erbB-2
DNA amplification. First, the extracted tumor DNA may be excessively
degraded or diluted by DNA from stromal cells. Second, the c-erbB-2 DNA
probe must be carefully chosen and labeled. For example, oligonucleotide
c-erbB-2 probes may not be sensitive enough for measuring a low level of
c-erbB-2 amplification, because diploid copy numbers can be difficult to detect
(unpublished data). Third, the total amounts of DNA in the sample and control
tissue must be compensated for, often with a probe to an unamplified gene.
Many studies have used control probes to genes on chromosome 17, the location
of c-erbB-2, to correct for possible alterations in chromosome number.
Identical results, however, are obtained by using control probes to genes on
other chromosomes, with rare exception. Studies using control probes to the
beta-globin gene must be interpreted with caution, because one allele of this
gene is deleted occasionally in breast carcinomas.
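
The normalization described above, in which the c-erbB-2 band is compared between sample and control tissue after correcting for DNA loading with an unamplified control gene, can be sketched as a ratio of ratios. The Python example here is only an illustration under that assumption: the function name and the densitometry readings are hypothetical, and individual studies may have computed the fold amplification differently.

```python
def amplification_ratio(tumor_target: float, tumor_control: float,
                        normal_target: float, normal_control: float) -> float:
    """Estimate fold amplification from densitometry band intensities.

    Each target-gene intensity is first divided by the control-gene intensity
    from the same lane to correct for the amount of DNA loaded. The tumor
    value is then expressed relative to the normal-tissue value.
    """
    tumor_normalized = tumor_target / tumor_control
    normal_normalized = normal_target / normal_control
    return tumor_normalized / normal_normalized


if __name__ == "__main__":
    # Hypothetical readings in arbitrary densitometry units.
    fold = amplification_ratio(tumor_target=800, tumor_control=100,
                               normal_target=200, normal_control=100)
    print("Estimated fold amplification:", fold)
```

If the control gene itself has lost a copy in the tumor, as noted for beta-globin, the tumor control intensity drops and the computed fold amplification is inflated, which is why the choice of control probe matters.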

Amplification of c-erbB-2 DNA was assessed by using the polymerase
chain reaction (PCR) in one recent study. Oligoprimers for the c-erbB-2 gene
and a control gene are added to the sample's DNA, and PCR is performed. If
the sample contains more copies of c-erbB-2 DNA than of the control gene, the
c-erbB-2 DNA is replicated preferentially.

Detection of c-erbB-2 mRNA Overproduction

Overproduction of c-erbB-2 mRNA usually is measured by RNA dot blot or
Northern blot hybridization. Both techniques require extraction of RNA but
otherwise are analogous to DNA dot blot and Southern blot hybridization. Use
of PCR for detection of c-erbB-2 mRNA has been described in two recent
abstracts.

Overproduction of c-erbB-2 mRNA can be measured by in situ
hybridization. Sections are mounted on glass slides, treated with protease,
hybridized with a radiolabeled probe, washed, treated with nuclease to remove
unbound probe, and developed for autoradiography. Silver grains are seen only
over tumor cells that overproduce c-erbB-2 mRNA. Negative control probes are
used. Our experience indicates that these techniques are relatively
insensitive for detecting c-erbB-2 mRNA overproduction in routinely processed
tissue. Although the sensitivity may be increased by modifications that allow
simultaneous detection of c-erbB-2 DNA and mRNA, in situ hybridization still
is cumbersome and expensive (unpublished data).

All of the above c-erbB-2 mRNA detection techniques have several
problems that make them more difficult to perform than techniques for detecting
DNA amplification. One major problem is the rapid degradation of RNA in
tissue that is not immediately frozen or fixed. In addition, during the detection
procedure, RNA can be degraded by RNase, a ubiquitous enzyme, which must
be eliminated meticulously from laboratory solutions. Third, control probes to
genes that are uniformly expressed in the tissue of interest need to be carefully
selected.



Detection of c-erbB-2 Protein Overproduction

The most accurate methods for detecting c-erbB-2 protein overproduction are
the Western blot method and immunoprecipitation. Both techniques can
document the binding specificity of various antibodies against c-erbB-2
protein. In Western blot studies, protein is extracted from the tissue,
separated by electrophoresis (according to size), transferred to a membrane,
and detected by using antibodies to c-erbB-2. In immunoprecipitation studies,
antibodies against c-erbB-2 are added to a tumor lysate, and the resulting
protein-antibody precipitate is separated by gel electrophoresis and stained
for protein. Both Western blot and immunoprecipitation are useful research
tools but currently are not practical for diagnostic pathology. Two recent
abstracts have described an enzyme-linked immunosorbent assay (ELISA) for
detection of c-erbB-2 protein.

Overproduction of c-erbB-2 protein is most commonly assessed by various
immunohistochemical techniques. These procedures often generate conflicting
results, which are explained at least partially by three factors. First, various
studies have used different polyclonal and monoclonal antibodies. Because
some polyclonal antibodies recognize weak bands in addition to the c-erbB-2
protein band on Western blot or immunoprecipitation, the results of these
studies should be interpreted with caution. Even some monoclonal
antibodies immunoprecipitate protein bands in addition to c-erbB-2 (p185).
Second, tissue fixation contributes to variability between studies. For example,
some antibodies detect c-erbB-2 protein only in frozen tissue and do not react
in fixed tissue. In general, formalin fixation diminishes the sensitivity of
immunohistochemical methods and decreases the number of reactive cells.
When Bouin's fixative is used, there may be a higher percentage of positive
cases. Third, minimal criteria for interpreting immunohistochemical staining
are generally lacking. Although there is general agreement that distinct crisp
cytoplasmic membrane staining is diagnostic for c-erbB-2 activation in breast
carcinoma, the number of positive cells and the staining intensity required to
diagnose c-erbB-2 protein overproduction vary from study to study and from
antibody to antibody. Degradation of c-erbB-2 protein is not a problem because
it can be detected in intact form more than 24 hours after tumor resection
without fixation or freezing.



ACTIVATION OF c-erbB-2 IN BREAST LESIONS
Incidence of c-erbB-2 Activation

Most studies of c-erbB-2 oncogene activation do not specify histological
subtypes of infiltrating breast carcinoma. Amplification of c-erbB-2 DNA was
found in 19.1 percent (519 of 2715) of invasive carcinomas in 25 studies
(Table 1), and c-erbB-2 mRNA or protein overproduction was detected in 20.9
percent (568 of 2714) of invasive carcinomas in 20 studies. Twelve studies have
documented c-erbB-2 mRNA or protein overproduction in 15 percent (88 of 604)
of carcinomas that lacked c-erbB-2 DNA amplification.
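
The pooled incidence figures above are simple proportions of positive cases over cases studied. A brief Python sketch reproduces them from the counts quoted in the paragraph; the helper name and the labels are illustrative.

```python
def percent(cases: int, total: int, digits: int = 1) -> float:
    """Percentage of positive cases, rounded to the given number of digits."""
    return round(100 * cases / total, digits)


if __name__ == "__main__":
    # (label, positive cases, cases studied, digits), counts from the text.
    pooled = [
        ("DNA amplification", 519, 2715, 1),
        ("mRNA or protein overproduction", 568, 2714, 1),
        ("overproduction without amplification", 88, 604, 0),
    ]
    for label, cases, total, digits in pooled:
        print(f"{label}: {percent(cases, total, digits)} percent")
```

Pooling counts across series in this way weights each study by its size, so large series dominate the combined percentage.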

The incidence of c-erbB-2 activation in infiltrating breast carcinoma varies
with the histological subtype. Approximately 22 percent (142 of 850) of
infiltrating ductal carcinomas have c-erbB-2 activation, as expected from the
above data. Other variants of breast carcinoma with frequent c-erbB-2
activation are inflammatory carcinoma (62 percent, 54 of 87), Paget's disease
(82 percent, 9 of 11), and medullary carcinoma (22 percent, 5 of 23). In
contrast, c-erbB-2 activation is infrequent in infiltrating lobular carcinoma
(7 percent, 5 of 73) and tubular carcinoma (7 percent, 1 of 15).

The c-erbB-2 protein overproduction is present in 44 percent (44 of 100) of
ductal carcinomas in situ and especially comedocarcinoma in situ (68 percent,
49 of 72). The micropapillary type of ductal carcinoma in situ also tends to
have c-erbB-2 activation, especially if larger cells are present. The greater
frequency of c-erbB-2 protein overproduction in comedocarcinoma in situ,
compared with infiltrating ductal carcinoma, could be explained by the fact
that many infiltrating ductal carcinomas arise from other types of intraductal
carcinoma, which show c-erbB-2 activation infrequently. Others have speculated
that carcinoma in situ with c-erbB-2 activation tends to regress or to lose
c-erbB-2 activation during progression to invasion. Infiltrating and in situ
components of ductal carcinoma, however, usually are similar with respect to
c-erbB-2 activation, although some authors have noted more heterogeneity of
the immunohistochemical staining pattern in invasive than in in situ
carcinoma. Activation of c-erbB-2 is infrequent in lobular carcinoma in situ. If
lesions contain more than one histological pattern of carcinoma in situ, the
c-erbB-2 protein overproduction tends to occur in the comedocarcinoma in situ
but may include other areas of carcinoma in situ. Overproduction of
c-erbB-2 protein in ductal carcinoma in situ correlates with larger cell size
and a periductal lymphoid infiltrate.

Activation of c-erbB-2 has not been identified in benign breast lesions,
including fibrocystic disease, fibroadenomas, and radial scars (Table 2). Strong
membrane immunohistochemical reactivity for c-erbB-2 has not been described
in atypical ductal hyperplasia, although weak accentuation of membrane staining
has been noted infrequently. In normal breast tissue, c-erbB-2 DNA is
diploid, and c-erbB-2 is expressed at lower levels than in activated tumors.

These preliminary data suggest that c-erbB-2 activation may not be useful
for resolving many of the common problems in diagnostic surgical pathology. For
example, c-erbB-2 activation is infrequent in tubular carcinoma and radial scars.
In addition, because c-erbB-2 activation is unusual in atypical ductal hyperplasia,
cribriform carcinoma in situ, and papillary carcinoma in situ, detection of
c-erbB-2 activation in these lesions may not be helpful in their differential
diagnosis. The histological features of comedocarcinoma in situ, which commonly
overproduces c-erbB-2, are unlikely to be mistaken for those of benign lesions.
Activation of



TABLE 2. c-erbB-2 ACTIVATION IN BENIGN HUMAN BREAST LESIONS

                              c-erbB-2 DNA     c-erbB-2 mRNA     c-erbB-2 Protein
Histological Diagnosis        Amplification*   Overproduction    Overproduction
Fibrocystic disease           0/10
Atypical ductal hyperplasia                                      2 (weak)/21,
                                                                 1 (cytoplasmic)/13
Benign ductal hyperplasia                                        0/12
Sclerosing adenosis                                              0/4
Fibroadenomas                 0/16, 0/6        0/6, 0/3          0/21, 0/10,
                                                                 0/8, 0/3
Radial scars
Blunt duct adenosis
"Breast mastosis"                              0/3

*Shown as number of cases with activation/number of cases studied.

c-erbB-2, however, does favor infiltrating ductal carcinoma over infiltrating
lobular carcinoma. Further studies of these issues would be useful.

Correlation of c-erbB-2 Activation With Pathologic Prognostic Factors

Multiple studies have attempted to correlate c-erbB-2 activation with various
pathologic prognostic factors (Table 3). Activation of c-erbB-2 was correlated
with lymph node metastasis in 8 of 28 series, with higher histological grade in 6
of 17 series, and with higher stage in 4 of 14 series. Large tumor size was not
associated with c-erbB-2 activation in most studies (11 of 14). Tetraploid DNA
content and low proliferation, measured by Ki-67, have been suggested as
prognostic factors and may correlate with c-erbB-2 activation.

Correlation of c-erbB-2 Activation With Clinical Prognostic Factors

Various studies have also attempted to correlate c-erbB-2 activation with clinical
features that may predict a poor outcome (Table 4). Activation of c-erbB-2
correlated with absence of estrogen receptors in 10 of 28 series and with
absence of progesterone receptors in 6 of 18 series. In most studies, patient age
did not correlate with c-erbB-2 activation, and, in the rest of the reports,
c-erbB-2 activation was associated with either younger or older ages.

Correlation of c-erbB-2 Activation With Patient Outcome

Slamon et al first showed that amplification of the c-erbB-2 oncogene
independently predicts decreased survival of patients with breast carcinoma. The
correlation of c-erbB-2 amplification with poor outcome was nearly as strong as
the correlation of number of involved lymph nodes with poor outcome. Slamon
et al also reported that c-erbB-2 amplification is an important prognostic
indicator only in patients with lymph node metastasis.

A large number of subsequent studies also attempted to correlate c-erbB-2
activation with prognosis (Table 5). In 12 series, there was a correlation
between c-erbB-2 activation and tumor recurrence or decreased survival. In five
of these series, the predictive value of c-erbB-2 activation was reported to be
independent of other prognostic factors. In contrast, 18 series did not confirm
the correlation of c-erbB-2 activation with recurrence or survival. Four possible
explanations for this controversy are discussed below.

One problem is that c-erbB-2 amplification correlates with prognosis
mainly in patients with lymph node metastasis. As summarized in Table 5, most
studies of patients with axillary lymph node metastasis showed a correlation of
c-erbB-2 activation with poor outcome. In contrast, most studies of patients
without axillary metastasis have not demonstrated a correlation with patient
outcome. Table 6 summarizes the studies in which all patients (with and
without axillary metastasis) were considered as one group. There is a trend for
studies with a higher percentage of metastatic cases to show an association
between c-erbB-2 activation and poor outcome. Thus, most of the current
evidence suggests that c-erbB-2 activation has prognostic value only in patients
with metastasis to lymph nodes.






12/09/2003 17:15 FAX 310 208 5971 INFO 6

178 T.P. SINGLETON AND J.G. STRICKLER



TABLE 5. CORRELATION OF c-erbB-2 ACTIVATION WITH OUTCOME IN PATIENTS
WITH BREAST CARCINOMA



Number off mtfintB 



mm 

Type of ATelatCastrslo 





c-erbB-2 




Ax///ery . No 


StetlstlcBl 






AcHvatlOA^ 


Tola/ 


l^mph Nodes jifefaeiasrB 


AtNdysIs" 


Rieferenee 


<0.05 


DNA 


176 




ut 
m 


87 . 


<0.06 


DMA 






f 1 


80 . 


<0.05 


DHA 


57 




u 


65 


<0.0S 


DNA 


41 




1 1 

u 


93 


<0.0S 


mRNA 


69. 




II 




<0.05 




102 




M 


10 1 


<0.05 


ONA 






Uk 
w 


81 


<0.G5 


DNA 






U 


• 17 


<0.05 


DNA 




01 

511 


II 


Si 


<0.05 


ONA 




88 


M 

Ifi 




<0.05 


ProtelrhWB 




350 


M 


85 


<0.0S 


Protein 




82 44 


U 


101 


0.05-0.15 


DNA 


57 




u 


111 


0.05-0,15 


Protein 


189 




M 


92 


0.05-0.15 


Protein 




120 


U 


86 


>0.15 


DNA 


130 




U 


113 


>0.15 


DNA 


122 




M 


4 


>0.15 


DNA 


50 




U 


44 


>0.15 


mRNA 


57 




u 


50 


>0.15 


Protein 


280 




M 


88 


>ai5 


Protein 


195 




U 


11 


>0.15 


Protein 


102 




U 


38 


>0-15 


Protein 




137 


U 


17 


>0.15 


ONA 




181 


M 


81 


>0.15 


DNA 




159 


U 


17 


>0.15 


ONA 




73 


U 


87 


>0.15 


Proleln-WB 




378 


u 


85 


>0.15 


Pro1e!n>Wa 




192 


u 


17 


>0.15 


Protein 




141 


u 


86 


>0.15 


Protein 




41 


u 


40



(a) The endpoints of these studies were tumor recurrence [illegible]
c-erbB-2 activation and a poorer patient outcome is statistically significant at
<0.05, is of equivocal significance at 0.05 to 0.15, and is not significant at >0.15.

(b) Shown as variable measured. Letters "WB" indicate assay by Western blot; the
other protein studies used immunohistochemical methods.

(c) M = multivariate statistical analysis; U = univariate statistical analysis.

TABLE 6. PERCENTAGE OF BREAST CARCINOMAS WITH METASTASIS COMPARED
WITH PROGNOSTIC SIGNIFICANCE OF c-erbB-2 ACTIVATION



% of tumors with
lymph node
metastasis in
each study



70- 



60- 



60- 



40- 



71 (DNA)« 



61 (ONA)» 

69pNA)S' 
58(Prote]ny" 



64(0NA)ii« 



42(PiotBln)« 



/><0.05 



0-05<P<0.1S 



64(mRNA)B 
^1(DNA}< 



57 (DNA)'« 
55(Proteln>» 

4e(Prolein)^» 
46(Proteln)n 



—I 

p>o:i6 



P for correlation of activation with patient outcome

Each study's percentage of breast carcinomas with metastasis is compared with the association between
c-erbB-2 activation and outcome. These data include only the [illegible]
cancer patients, whether or not they had axillary metastasis. Superscripts are the references. In parentheses
are types of c-erbB-2 activation. P values are interpreted as in Table 5.



A second problem is that various types of breast carcinoma are grouped
together in many survival studies. Because the current literature suggests that
c-erbB-2 activation is infrequent in lobular carcinoma, studies that combine
infiltrating ductal and lobular carcinomas may dilute the prognostic effect of
c-erbB-2 activation in ductal tumors. In addition, most studies do not analyze
inflammatory breast carcinoma separately. This condition frequently shows
c-erbB-2 activation and has a worse prognosis than the usual mammary
carcinoma, but it is an uncommon lesion.

A third potential problem is the paucity of studies that attempt to correlate
c-erbB-2 activation with clinical outcome in subsets of breast carcinoma without
metastasis. Two recent abstracts reported that in patients without lymph node
metastasis who had various risk factors for recurrence (such as large tumor size
and absence of estrogen receptors), c-erbB-2 overexpression predicted early
recurrence. In patients with ductal carcinoma in situ, one small study found
no association between tumor recurrence and c-erbB-2 activation.

A fourth problem is the lack of data regarding whether the prognosis
correlates better with c-erbB-2 DNA amplification or with mRNA or protein
overproduction. Most studies that find a correlation between c-erbB-2
activation and poor patient outcome measure c-erbB-2 DNA amplification (Table 5),
and breast carcinoma patients with greater amplification of c-erbB-2 may have
poorer survival. Recent studies suggest that amplification has more prognostic
power than overproduction, but the clinical significance of c-erbB-2
overproduction without DNA amplification deserves further research. Few
studies have attempted to correlate patient outcome with c-erbB-2 mRNA
overproduction, and many studies of c-erbB-2 protein overproduction use
relatively less reliable methods such as immunohistochemical studies with
polyclonal antibodies.

Comparison of c-erbB-2 Activation With Other Oncogenes in
Breast Carcinoma

Other oncogenes that may have prognostic implications in human breast cancer
are reviewed elsewhere. This section will be restricted to a comparison
between the clinical relevance of c-erbB-2 and these other oncogenes.

The c-myc gene is often activated in breast carcinomas, but c-myc
activation generally has less prognostic importance than c-erbB-2 activation.
One study found a correlation between increased mRNAs of c-erbB-2 and
c-myc, although other reports have not confirmed this. Subsequent research,
however, could demonstrate a subset of breast carcinomas in which c-myc has
more prognostic importance than c-erbB-2.

The gene c-erbB-1 for the epidermal growth factor receptor (EGFR) is
homologous with c-erbB-2 but is infrequently amplified in breast carcinomas.
Overproduction of EGFR, however, occurs more frequently than amplification
and may correlate with a poor prognosis. In studies that have examined both
c-erbB-2 and EGFR in the same tumor, c-erbB-2 has a stronger correlation with
poor prognostic factors. Studies have tended to show no correlation between
amplification of c-erbB-2 and c-erbB-1 or overproduction of c-erbB-2 and EGFR,
although at the molecular level EGFR mediates phosphorylation of c-erbB-2
protein. Recent reviews describe EGFR in breast carcinoma.

The genes c-erbA and ear-1 are homologous to the thyroid hormone
receptor, and they are located adjacent to c-erbB-2 on chromosome 17. These genes
are frequently coamplified with c-erbB-2 in breast carcinomas. The absence of
c-erbA expression in breast carcinomas, however, is evidence against an
important role for this gene in breast neoplasia. Amplification of c-erbB-2 can occur
without ear-1 amplification, and these tumors have a decreased survival that is
similar to tumors with both c-erbB-2 and ear-1 amplification. Consequently,
c-erbB-2 amplification seems to be more important than amplification of c-erbA
or ear-1.

Other genes also have been compared with c-erbB-2 activation in breast
carcinomas. One study found a significant correlation between increased
c-erbB-2 mRNA and increased mRNAs of fos, platelet-derived growth factor chain A,
and Ki-ras. Allelic deletion of c-Ha-ras may indicate a poorer prognosis in
breast carcinoma, but it has not been compared with c-erbB-2 activation. Some
studies have suggested a correlation between advanced stage or recurrence of
breast carcinoma and activation of any one of several oncogenes.






c-erbB-2 ONCOGENE 179



ACTIVATION OF c-erbB-2 IN NON-MAMMARY TISSUES

Incidence of c-erbB-2 Activation in Non-Mammary Tissues

Table 7 summarizes the normal tissues in which c-erbB-2 expression has been
detected, usually with immunohistochemical methods using polyclonal anti-



TABLE 7. PRESENCE OR ABSENCE OF c-erbB-2 mRNA OR PROTEIN IN
NORMAL HUMAN TISSUES

Tissues With c-erbB-2 mRNA

Skin
Stomach
Jejunum
Colon
Kidney
Liver
Fetal brain
Thyroid
Uterus
Placenta

Tissues Producing c-erbB-2 Protein

Epidermis
External root sheath
Eccrine sweat gland
Fetal oral mucosa
Fetal esophagus
Stomach
Fetal intestine
Small intestine
Colon
Fetal kidney
Fetal proximal tubule
Distal tubule
Fetal collecting duct
Fetal renal pelvis
Fetal ureter
Hepatocytes
Pancreatic acini
Pancreatic ducts
Endocrine cells of islets of Langerhans
Fetal trachea
Fetal bronchioles
Bronchioles
Fetal ganglion cells

Tissues Lacking c-erbB-2 mRNA

Kidney
Ovary
Blood vessels

Tissues Lacking c-erbB-2 Protein

Postnatal oral mucosa
Postnatal esophagus
Glomerulus
Postnatal Bowman's capsule
Postnatal proximal tubule
Postnatal collecting duct
Postnatal renal pelvis
Postnatal ureter
Liver
Pancreatic islets
Postnatal trachea
Postnatal bronchioles
Postnatal alveoli
Postnatal brain
Postnatal ganglion cells
Endothelium
Adrenocortical cells
Postnatal thymus
Fibroblasts
Smooth muscle cells
Cardiac muscle cells

(a) This protein study used Western blots; the rest used immunohistochemical methods.

bodies. Only a few studies have been performed, and some of these do not
demonstrate convincing cell membrane reactivity in the published
photographs. The interpretations in these studies, however, are listed, with the
caveat that these findings should be confirmed by immunoprecipitation or
Western or RNA blots. Production of c-erbB-2 has been identified in normal
epithelium of the gastrointestinal tract and skin. Discrepancies regarding
c-erbB-2 protein in other tissues could be due, at least in part, to differences in
techniques.

The data on c-erbB-2 activation in various non-mammary neoplasms
should be interpreted with caution, because only small numbers of tumors have
been studied, usually by immunohistochemical methods using polyclonal
antibodies. Studies using cell lines have been excluded, because cell culture can
induce amplification and overexpression of other genes, although this has not
been documented for c-erbB-2.

Activation of c-erbB-2 has been identified in 32 percent (64 of 203) of
ovarian carcinomas in eight studies (Table 8). One abstract stated that ovarian
carcinomas contained significantly more c-erbB-2 protein than ovarian
non-epithelial malignancies. Another report showed that 12 percent of ovarian
carcinomas had c-erbB-2 overproduction without amplification.

Activation of c-erbB-2 has been identified in 20 percent (40 of 198) of
gastric adenocarcinomas in seven studies, including 33 percent (21 of 64) of



TABLE 8. c-erbB-2 ACTIVATION IN HUMAN GYNECOLOGIC TUMORS(a)

                                          c-erbB-2 DNA      c-erbB-2 mRNA      c-erbB-2 Protein
Tumor Type                                Amplification     Overproduction     Overproduction

Ovary: carcinoma, not otherwise           [illegible]       23/67              36/72
  specified
Ovary: serous (papillary) carcinoma       [illegible]
Ovary: endometrioid carcinoma             0/3
Ovary: mucinous carcinoma                 [illegible]
Ovary: clear cell carcinoma               [illegible]
Ovary: mixed epithelial carcinoma         [illegible]
Ovary: endometrioid borderline tumor      0/1
Ovary: mucinous borderline tumor          0/3
Ovary: serous cystadenoma
Ovary: mucinous cystadenoma
Ovary: sclerosing stromal tumor           0/1
Ovary: fibrothecoma                       0/1
Uterus: endometrial adenocarcinoma        0/4, 0/1

(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods.






intestinal or tubular subtypes and 9 percent (4 of 47) of diffuse or signet ring cell
subtypes (Table 9). Activation of c-erbB-2 has been detected in 2 percent (6 of
281) of colorectal carcinomas, although an additional immunohistochemical
study detected c-erbB-2 protein in seven of eight tissues fixed in Bouin's
solution. One study found greater immunohistochemical reactivity for c-erbB-2
protein in colonic adenomatous polyps than in the adjacent normal epithelium,
using Bouin's fixative. Lesions with anaplastic features and progression to
invasive carcinoma tended to show decreased immunohistochemical reactivity for
c-erbB-2 protein. Hepatocellular carcinomas (12 of 14 cases) and
cholangiocarcinomas (46 of 63 cases) reacted with antibodies against c-erbB-2 in one study, but
some of these "positive" cases showed only diffuse cytoplasmic staining, which



TABLE 9. c-erbB-2 ACTIVATION IN HUMAN GASTROINTESTINAL TUMORS(a)

                                                       c-erbB-2 DNA            c-erbB-2 Protein
Tumor Type                                             Amplification           Overproduction

Esophagus: squamous cell carcinoma                     0/1
Stomach: carcinoma, poorly differentiated
Stomach: adenocarcinoma                                                        4/27, [illegible]
Stomach: carcinoma, intestinal or tubular type         5/10                    16/54
Stomach: carcinoma, diffuse or signet ring cell type   0/2
Colorectum: carcinoma                                  [illegible], 1/45,
                                                       1/45, 1/45,
                                                       0/40, 0/32,
                                                       [illegible], 0/3
Colon: villous adenoma
Colon: tubulovillous adenoma
Colon: tubular adenoma                                                         [illegible]
Colon: hyperplastic polyp
Intestine: leiomyosarcoma                                                      0/1
Hepatocellular carcinoma                               0/12                    12/14, [illegible]
Hepatoblastoma                                         0/16
Cholangiocarcinoma                                                             46/63
Pancreas: adenocarcinoma                                                       2/80, [illegible]
Pancreas: acinar carcinoma                                                     0/1
Pancreas: clear cell carcinoma                                                 [illegible]
Pancreas: large cell carcinoma                                                 0/3
Pancreas: signet ring carcinoma                                                [illegible]
Pancreas: chronic inflammation                                                 [illegible]



(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods. No studies analyzed for c-erbB-2
mRNA.

[illegible] tissues fixed in Bouin's solution.

Only cases with distinct membrane staining are interpreted as showing c-erbB-2 overproduction.

TABLE 10. c-erbB-2 ACTIVATION IN HUMAN PULMONARY TUMORS(a)

                              c-erbB-2 DNA                 Protein
Tumor Type                    Amplification                Overproduction

Non-small cell carcinoma      2/60, [illegible]
Epidermoid carcinoma
Adenocarcinoma                0/21, 1/13, [illegible]      4/12
Large cell carcinoma          0/9, [illegible]
Small cell carcinoma                                       [illegible]
Carcinoid tumor               0/1                          0/3

(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods. No studies analyzed for c-erbB-2
mRNA.



does not indicate c-erbB-2 activation in breast neoplasms. Also, some
pancreatic carcinomas and chronic pancreatitis tissue had cytoplasmic
immunohistochemical reactivity for c-erbB-2 protein, in addition to the rare case of
pancreatic adenocarcinoma with distinct cell membrane staining.

Tables 10 through 14 summarize the studies of c-erbB-2 activation in other
neoplasms. The c-erbB-2 oncogene is not activated in most of these tumors.
Activation of c-erbB-2 has been detected in 1 percent (4 of 299) of pulmonary
non-small cell carcinomas in nine studies, although one additional report
found c-erbB-2 protein overproduction in 41 percent (7 of 17). Renal cell
carcinoma had c-erbB-2 activation in 7 percent (2 of 30) in four studies.
Overproduction of c-erbB-2 protein was described in one transitional cell carcinoma of the
urinary bladder, a grade 2 papillary lesion. Squamous cell carcinoma and basal
cell carcinoma of the skin may contain c-erbB-2 protein, but it is not clear

TABLE 11. c-erbB-2 ACTIVATION IN HUMAN HEMATOLOGIC PROLIFERATIONS(a)

                                  c-erbB-2 DNA      c-erbB-2 mRNA      c-erbB-2 Protein
Tumor Type                        Amplification     Overproduction     Overproduction

Hematologic malignancies
Malignant lymphoma                                  [illegible]        0/15
Acute leukemia                    0/14
Acute lymphoblastic leukemia
Acute myeloblastic leukemia
Chronic leukemia
Chronic lymphocytic leukemia      0/6
Chronic myelogenous leukemia      [illegible]
Myeloproliferative disorder       0/1

(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods.

TABLE 12. c-erbB-2 ACTIVATION IN HUMAN TUMORS OF SOFT TISSUE AND BONE(a)

                                  c-erbB-2 DNA
Tumor Type                        Amplification

Sarcoma
Malignant fibrous histiocytoma    [illegible]
Liposarcoma
Pleomorphic sarcoma
Rhabdomyosarcoma
Osteogenic sarcoma
Chondrosarcoma                    0/1
Ewing's sarcoma
Schwannoma

(a) Shown as number of cases with amplification/total number of cases studied; reference is
given as superscript. No studies analyzed for c-erbB-2 mRNA or c-erbB-2 protein.



whether the protein level is increased over that of normal skin. Thyroid
carcinomas and adenomas can have low levels of increased c-erbB-2 mRNA.
One abstract described low-level c-erbB-2 DNA amplification in one of ten
salivary gland pleomorphic adenomas.

Correlation of c-erbB-2 Activation With Patient Outcome

Very few studies have attempted to correlate c-erbB-2 activation in
non-mammary tumors with outcome. Slamon et al showed that c-erbB-2 amplification
or overexpression in ovarian carcinomas correlates with decreased survival,
especially when marked activation is present. However, they did not report the
stage, histological grade, or histological subtype of these neoplasms. Another
study of stages III and IV ovarian carcinomas found a correlation between
decreased survival and c-erbB-2 protein overproduction, but not between
survival and histological grade. One abstract stated that c-erbB-2 protein
overproduction in 10 of 16 pulmonary adenocarcinomas correlated with decreased
disease-free interval. Another abstract described a tendency for immunohisto-



TABLE 13. c-erbB-2 ACTIVATION IN HUMAN TUMORS OF THE URINARY TRACT(a)



Tumor Type



DNA
Amplification



mRNA
Overproduction



1/6,67 1/4,10? 0/5" 
0/45^ 



0/16W 



c-erbB-2
Protein
Overproduction



Kidney: renal cell carcinoma
Wilms' tumor
Prostate: adenocarcinoma
Urinary bladder: carcinoma

(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods.



Q/23»
1/48W

TABLE 14. c-erbB-2 ACTIVATION IN MISCELLANEOUS HUMAN TUMORS(a)



Tumor Type



DNA
Amplification



c-erbB-2 mRNA
Overproduction



Protein
Overproduction



Skin: malignant melanoma
Skin, head and neck: squamous
cell carcinoma

Site not stated: squamous cell
carcinoma

Salivary gland: adenocarcinoma
Parotid gland: adenoid cystic
carcinoma

Thyroid: anaplastic carcinoma
Thyroid: papillary carcinoma
Thyroid: adenocarcinoma
Thyroid: adenoma
Neuroblastoma
Meningioma



0/7»» 



1/1 » 



0/11 
0/1W 



d(k)WleveIsV5t 
10owlevel&)/2f 



0/1 0« 



(a) Shown as number of cases with amplification (or overproduction)/total number of cases studied; reference is
given as superscript. All protein studies used immunohistochemical methods.



chemical reactivity for c-erbB-2 protein to correlate with higher grades of
prostatic adenocarcinoma. Additional prognostic studies of ovarian carcinomas and
other neoplasms are needed.



SUMMARY 

Activation of the c-erbB-2 oncogene can occur by amplification of c-erbB-2
DNA and by overproduction of c-erbB-2 mRNA and protein. Approximately
20 percent of breast carcinomas show evidence of c-erbB-2 activation,
which correlates with a poor prognosis primarily in patients with metastasis to
axillary lymph nodes. Studies that have attempted to correlate c-erbB-2
activation with other prognostic factors in breast carcinoma have reported conflicting
conclusions. The pathologic and clinical significance of c-erbB-2 activation in
other neoplasms is unclear and should be assessed by additional studies.



I, Paul Polakis, Ph.D., declare and say [illegible]

1. [illegible] Department of Biochemistry of the Michigan
[illegible]

[illegible] my laboratory [illegible]
differential expression of various genes in tumor cells [illegible]

The purpose of this research is to identify proteins that are abundantly expressed
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at
lower levels, on corresponding normal cells. We [illegible]

[illegible] When such a tumor antigen protein is
identified, one can produce an antibody that recognizes [illegible]
Such an antibody finds use in the diagnosis of human cancer [illegible]
serve as an effective therapeutic in the treatment of



that are differentially expressed in one tissue or cell type relative to another. In
the course of our research using microarray analysis, we have identified
approximately 200 gene transcripts that are present in human tumor cells at
significantly higher levels than in corresponding normal human cells. To date,
[illegible]
differentially expressed gene transcripts and have used these
antibodies to quantitatively determine the level of production of these
antigen proteins in both human cancer cells and corresponding normal cells. We
[illegible]

5. From the mRNA and protein expression analyses described in paragraph 4
[illegible]
level of mRNA present in any particular cell type and the level of protein
expressed from that mRNA in that cell type. In approximately 80% of our
observations we have found that increases in the level of a particular mRNA
correlates with changes in the level of protein expressed from that mRNA when
human tumor cells are compared with their corresponding normal cells.

6. Based upon my own experience accumulated in more than 20 years of
research, including the data discussed in paragraphs 4 and 5 above and my
knowledge of the relevant scientific literature, it is my
opinion that for human genes, an increased level of mRNA in a tumor cell relative
to a normal cell typically correlates to a similar increase in abundance of the
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a
central dogma in molecular biology that increased mRNA levels are predictive of
corresponding increased levels of the encoded protein. While there have been
published reports of genes for which such a correlation does not exist, it is my
opinion that such reports are exceptions to the commonly understood general rule
that increased mRNA levels are predictive of corresponding increased levels of the
encoded protein.

7. I hereby declare that all statements made herein of my own knowledge are
true and that all statements made on information or belief are believed to be true,
and further that these statements were made with the knowledge that willful false
statements and the like so made are punishable by fine or imprisonment, or both,
under Section 1001 of Title 18 of the United States Code and that such willful
statements may jeopardize the validity of the application or any patent issued
thereon.



Dated : S/oVoY 



By: 

Paul Polakis, Ph.D.



CURRICULUM VITAE 



PAUL G. POLAKIS
Staff Scientist
Genentech, Inc.
1 DNA Way, MS#40
S. San Francisco, CA 94080



EDUCATION:

Ph.D., Biochemistry, Department of Biochemistry,
Michigan State University (1984)

B.S., Biology, College of Natural Science, Michigan State University (1977)



PROFESSIONAL EXPERIENCE:

2002-present    Staff Scientist, Genentech, Inc.,
                S. San Francisco, CA

1999-2002       Senior Scientist, Genentech, Inc.,
                S. San Francisco, CA

1997-1999       Research Director,
                Onyx Pharmaceuticals, Richmond, CA

1992-1996       Senior Scientist, Project Leader, Onyx
                Pharmaceuticals, Richmond, CA

1991-1992       Senior Scientist, Chiron Corporation,
                Emeryville, CA

1989-1991       Scientist, Cetus Corporation, Emeryville, CA

1987-1989       Postdoctoral Research Associate, Genentech,
                Inc., South San Francisco, CA

1985-1987       Postdoctoral Research Associate, Department
                of Medicine, Duke University Medical Center,
                Durham, NC

1984-1985       Assistant Professor, Department of Chemistry,
                Oberlin College, Oberlin, Ohio

1980-1984       Graduate Research Assistant, Department of
                Biochemistry, Michigan State University,
                East Lansing, Michigan


PUBLICATIONS:

1. Polakis, P. G. and Wilson, J. E. 1982 Purification of a Highly Bindable Rat Brain
Hexokinase by High Performance Liquid Chromatography. Biochem. Biophys.
Res. Commun. 107, 937-943.

2. Polakis, P. G. and Wilson, J. E. 1984 Proteolytic Dissection of Rat Brain
Hexokinase: Determination of the Cleavage Pattern during Limited Digestion with
Trypsin. Arch. Biochem. Biophys. 234, 341-352.

3. Polakis, P. G. and Wilson, J. E. 1985 An Intact Hydrophobic N-Terminal
Sequence is Required for the Binding Rat Brain Hexokinase to Mitochondria. Arch.
Biochem. Biophys. 236, 328-337.

4. Uhing, R. J., Polakis, P. G. and Snyderman, R. 1987 Isolation of GTP-binding
Proteins from Myeloid HL60 Cells. J. Biol. Chem. 262, 15575-15579.

5. Polakis, P. G., Uhing, R. J. and Snyderman, R. 1988 The Formylpeptide
Chemoattractant Receptor Copurifies with a GTP-binding Protein Containing a
Distinct 40 kDa Pertussis Toxin Substrate. J. Biol. Chem. 263, 4969-4979.

6. Uhing, R. J., Dillon, S., Polakis, P. G., Truett, A. P. and Snyderman, R. 1988
Chemoattractant Receptors and Signal Transduction Processes in Cellular and
Molecular Aspects of Inflammation (Poste, G. and Crooke, S. T., eds.) pp 335-379.

7. Polakis, P. G., Evans, T. and Snyderman 1989 Multiple Chromatographic Forms
of the Formylpeptide Chemoattractant Receptor and their Relationship to
GTP-binding Proteins. Biochem. Biophys. Res. Commun. 161, 276-283.

8. Polakis, P. G., Snyderman, R. and Evans, T. 1989 Characterization of G25K, a
GTP-binding Protein Containing a Novel Putative Nucleotide Binding Domain.
Biochem. Biophys. Res. Commun. 160, 25-32.

9. Polakis, P., Weber, R. F., Nevins, B., Didsbury, J., Evans, T. and Snyderman, R.
1989 Identification of the ral and rac1 Gene Products, Low Molecular Mass
GTP-binding Proteins from Human Platelets. J. Biol. Chem. 264, 16383-16389.

10. Snyderman, R., Perianin, A., Evans, T., Polakis, P. and Didsbury, J. 1989 G
Proteins and Neutrophil Function. In ADP-Ribosylating Toxins and G Proteins:
Insights into Signal Transduction. (J. Moss and M. Vaughn, eds.) Amer. Soc.
Microbiol. pp. 295-323.



m. Hart, MJ., Pollalkis, ^.0., Evans, T. and Cenion^, RA.1990 The Identliflcation 
and Charaterization of an Epfdemial Growth Factor-Sfimulated Phosphorylation of a 
Specific Low Molecular Mass GTP-bindln^ Protein In a Reconstituted Phospholipid 
Vesicle System, dl. ®I®L aih)®iniD, 265, 5990-6p01. 

12. YatanI, A., Okabe, K., PoOaMs, !P, Halenbeck, R. McQormlck, F. ar^d Brownj, A. 
U. 1990 ras p2i and GAP Inhibit Coupling of Muscarinic Receptors to'AtriaO 
Channels. CellD. 6.1, 769-776. ^ 
• • • • ■ . • ■ • • . ■ ■ ? ' 

13= IWunemitsu, S., Ihnis, M.A., ClarKi R., McComnlck, P., Ullrich, A. and PoIaMs, 
PM: 1990 i\/iolecular Cloning and Expression of a G25K cDNA, the Human Homolog 
of the Yeast.Cell pycle Gene CbC42. Rflol Cell B8ol. 10, 5977-5982; 

Uo PoOakls, P.G. Rubinfeld, B. Evans, t. and McCormick, F. 1991 Purification of 
Plasma Membrane-Assoclaited GTPase Activating Protein {GAP). Spedfic for rap- 
1/krev-1 from HL60 Cells. Proe. isSaHB. Acad. ScS, y SA • 88, 239-243. 

15, Moran, F., PoIaWs, P., McComiIck, F., Pawspn, T. and Ellis, C. 1991 Protein 
Tyrosine Kinases Regulate the Phosphorylatibn, - Protein Dnteractitihs, Subcellular 
Diistributton, and Activity of p21 ras. GTPase Activating Protein. Mol CoOI. Bloi. 11, 
1804-1812 . j 

16. Rubinfeld, B., Wong, G., Bekesi, E., Wood, A., McCormick, F. and Polakis, P. G.
1991 A Synthetic Peptide Corresponding to a Sequence in the GTPase Activating
Protein Inhibits p21ras Stimulation and Promotes Guanine Nucleotide Exchange.
Internatl. J. Peptide and Prot. Res. 38, 47-53.

17. Rubinfeld, B., Munemitsu, S., Clark, R., Conroy, L., Watt, K., Crosier, W.,
McCormick, F., and Polakis, P. 1991 Molecular Cloning of a GTPase Activating
Protein Specific for the Krev-1 Protein p21rap1. Cell 65, 1033-1042.

18. Zhang, K., Papageorge, A. G., Martin, P., Vass, W. C., Olah, Z., Polakis, P.,
McCormick, F. and Lowy, D. R. 1991 Heterogenous Amino Acids in RAS and
Rap1A Specifying Sensitivity to GAP Proteins. Science 254, 1630-1634.

19. Martin, G., Yatani, A., Clark, R., Polakis, P., Brown, A. M. and McCormick, F.
1992 GAP Domains Responsible for p21ras-dependent Inhibition of Muscarinic Atrial
K+ Channel Currents. Science 255, 192-194.

20. McCormick, F., Martin, G. A., Clark, R., Bollag, G. and Polakis, P. 1992
Regulation of p21ras by GTPase Activating Proteins. Cold Spring Harbor Symposia
on Quantitative Biology, Vol. 56, 237-241.

21. Pronk, G. B., Polakis, P., Wong, G., deVries-Smits, A. M., Bos, J. L. and
McCormick, F. 1992 p60v-src Can Associate with and Phosphorylate the p21ras
GTPase Activating Protein. Oncogene 7, 389-394.

22. Polakis, P. and McCormick, F. 1992 Interactions Between p21ras Proteins and
Their GTPase Activating Proteins. In Cancer Surveys (Franks, L. M., ed.) 12, 25-
42.

23. Wong, G., Muller, O., Clark, R., Conroy, L., Moran, M., Polakis, P. and
McCormick, F. 1992 Molecular cloning and nucleic acid binding properties of the
GAP-associated tyrosine phosphoprotein p62. Cell 69, 551-558.

24. Polakis, P., Rubinfeld, B. and McCormick, F. 1992 Phosphorylation of rap1GAP
in vivo and by cAMP-dependent Kinase and the Cell Cycle p34cdc2 Kinase in vitro.
J. Biol. Chem. 267, 10780-10785.

25. McCabe, P.C., Haubruck, H., Polakis, P., McCormick, F., and Innis, M. A.
1992 Functional Interactions Between p21rap1A and Components of the Budding
pathway of Saccharomyces cerevisiae. Mol. Cell. Biol. 12, 4084-4092.

26. Rubinfeld, B., Crosier, W.J., Albert, I., Conroy, L., Clark, R., McCormick, F. and
Polakis, P. 1992 Localization of the rap1GAP Catalytic Domain and Sites of
Phosphorylation by Mutational Analysis. Mol. Cell. Biol. 12, 4634-4642.

27. Ando, S., Kaibuchi, K., Sasaki, K., Hiraoka, T., Nishiyama, T., Mizuno, T.,
Asada, M., Nunoi, H., Matsuda, I., Matsuura, Y., Polakis, P., McCormick, F. and
Takai, Y. 1992 Post-translational processing of rac p21s is important both for their
interaction with the GDP/GTP exchange proteins and for their activation of NADPH
oxidase. J. Biol. Chem. 267, 25709-25713.

28. Janoueix-Lerosey, I., Polakis, P., Tavitian, A. and deGunzberg, J. 1992
Regulation of the GTPase activity of the ras-related rap2 protein. Biochem.
Biophys. Res. Commun. 189, 455-464.

29. Polakis, P. 1993 GAPs Specific for the rap1/Krev-1 Protein. In GTP-binding
Proteins: the ras-superfamily. (J.C. LaCale and F. McCormick, eds.) 445-452.

30. Polakis, P. and McCormick, F. 1993 Structural requirements for the interaction
of p21ras with GAP, exchange factors, and its biological effector target. J. Biol.
Chem. 268, 9157-9160.

31. Rubinfeld, B., Souza, B., Albert, I., Muller, O., Chamberlain, S., Masiarz, F.,
Munemitsu, S. and Polakis, P. 1993 Association of the APC gene product with
beta-catenin. Science 262, 1731-1734.

32. Weiss, J., Rubinfeld, B., Polakis, P., McCormick, F., Cavenee, W. A. and Arden,
K. 1993 The gene for human rap1-GTPase activating protein (rap1GAP) maps to
chromosome 1p35-1p36.1. Cytogenet. Cell Genet. 66, 18-21.

33. Sato, K. Y., Polakis, P., Haubruck, H., Fasching, C. L., McCormick, F. and
Stanbridge, E. J. 1994 Analysis of the tumor suppressor activity of the K-rev gene in
human tumor cell lines. Cancer Res. 54, 552-559.

34. Janoueix-Lerosey, I., Fontenay, M., Tobelem, G., Tavitian, A., Polakis, P. and
DeGunzburg, J. 1994 Phosphorylation of rap1GAP during the cell cycle. Biochem.
Biophys. Res. Commun. 202, 967-975.

35. Munemitsu, S., Souza, B., Mueller, O., Albert, I., Rubinfeld, B., and Polakis, P.
1994 The APC gene product associates with microtubules in vivo and affects their
assembly in vitro. Cancer Res. 54, 3676-3681.



36. Rubinfeld, B. and Polakis, P. 1995 Purification of baculovirus produced
rap1GAP. Methods Enz. 255, 31.

37. Polakis, P. 1995 Mutations in the APC gene and their implications for protein
structure and function. Current Opinions in Genetics and Development 5, 66-71.

38. Rubinfeld, B., Souza, B., Albert, I., Munemitsu, S. and Polakis, P. 1995 The
APC protein and E-cadherin form similar but independent complexes with α-catenin,
β-catenin and plakoglobin. J. Biol. Chem. 270, 5549-5555.

39. Munemitsu, S., Albert, I., Souza, B., Rubinfeld, B., and Polakis, P. 1995
Regulation of intracellular β-catenin levels by the APC tumor suppressor gene.
Proc. Natl. Acad. Sci. 92, 3046-3050.

40. Lock, P., Fumagalli, S., Polakis, P., McCormick, F. and Courtneidge, S. A. 1996
The human p62 cDNA encodes Sam68 and not the rasGAP-associated p62 protein.
Cell 84, 23-24.

41. Papkoff, J., Rubinfeld, B., Schryver, B. and Polakis, P. 1996 Wnt-1 regulates
free pools of catenins and stabilizes APC-catenin complexes. Mol. Cell. Biol. 16,
2128-2134.

42. Rubinfeld, B., Albert, I., Porfiri, E., Fiol, C., Munemitsu, S. and Polakis, P. 1996
Binding of GSK3β to the APC-β-catenin complex and regulation of complex
assembly. Science 272, 1023-1026.

43. Munemitsu, S., Albert, I., Rubinfeld, B. and Polakis, P. 1996 Deletion of amino-
terminal structure stabilizes β-catenin in vivo and promotes the
hyperphosphorylation of the APC tumor suppressor protein. Mol. Cell. Biol. 16,
4088-4094.

44. Hart, M. J., Callow, M. G., Sousa, B. and Polakis, P. 1996 IQGAP1, a
calmodulin binding protein with a rasGAP related domain, is a potential effector for
cdc42Hs. EMBO J. 15, 2997-3005.

45. Nathke, I. S., Adams, C. L., Polakis, P., Sellin, J. and Nelson, W. J. 1996 The
adenomatous polyposis coli (APC) tumor suppressor protein is localized to plasma
membrane sites involved in active epithelial cell migration. J. Cell. Biol. 134, 165-
180.

46. Hart, M. J., Sharma, S., elMasry, N., Qui, R-G., McCabe, P., Polakis, P. and
Bollag, G. 1996 Identification of a novel guanine nucleotide exchange factor for the
rho GTPase. J. Biol. Chem. 271, 25452.

47. Thomas JE, Smith M, Rubinfeld B, Gutowski M, Beckmann RP, and Polakis P.
1996 Subcellular localization and analysis of apparent 180-kDa and 220-kDa
proteins of the breast cancer susceptibility gene, BRCA1. J. Biol. Chem. 1996
271, 28630-28635.



48. Hayashi, S., Rubinfeld, B., Souza, B., Polakis, P., Wieschaus, E., and Levine,
A. 1997 A Drosophila homolog of the tumor suppressor adenomatous polyposis coli
down-regulates β-catenin but its zygotic expression is not essential for the
regulation of armadillo. Proc. Natl. Acad. Sci. 94, 242-247.

49. Vleminckx, K., Rubinfeld, B., Polakis, P. and Gumbiner, B. 1997 The APC
tumor suppressor protein induces a new axis in Xenopus embryos. J. Cell. Biol.
136, 411-426.

50. Rubinfeld, B., Robbins, P., El-Gamil, M., Albert, I., Porfiri, E. and Polakis, P.
1997 Stabilization of β-catenin by genetic defects in melanoma cell lines. Science
275, 1790-1792.

51. Polakis, P. The adenomatous polyposis coli (APC) tumor suppressor. 1997
Biochem. Biophys. Acta, 1332, F127-F147.

52. Rubinfeld, B., Albert, I., Porfiri, E., Munemitsu, S., and Polakis, P. 1997 Loss of
β-catenin regulation by the APC tumor suppressor protein correlates with loss of
structure due to common somatic mutations of the gene. Cancer Res. 57, 4624-
4630.

53. Porfiri, E., Rubinfeld, B., Albert, I., Hovanes, K., Waterman, M., and Polakis, P.
1997 Induction of a β-catenin-LEF-1 complex by wnt-1 and transforming mutants of
β-catenin. Oncogene 15, 2833-2839.

54. Thomas JE, Smith M, Tonkinson JL, Rubinfeld B, and Polakis P. 1997
Induction of phosphorylation on BRCA1 during the cell cycle and after DNA damage.
Cell Growth Differ. 8, 801-809.

55. Hart, M., de los Santos, R., Albert, I., Rubinfeld, B., and Polakis, P. 1998 Down
regulation of β-catenin by human Axin and its association with the adenomatous
polyposis coli (APC) tumor suppressor, β-catenin and glycogen synthase kinase 3β.
Current Biology 8, 573-581.

56. Polakis, P. 1998 The oncogenic activation of β-catenin. Current Opinions in
Genetics and Development 9, 15-21.

57. Matt Hart, Jean-Paul Concordet, Irina Lassot, Iris Albert, Rico del los Santos,
Herve Durand, Christine Perret, Bonnee Rubinfeld, Florence Margottin, Richard
Benarous and Paul Polakis. 1999 The F-box protein β-TrCP associates with
phosphorylated β-catenin and regulates its activity in the cell. Current Biology 9,
207-10.

58. Howard C. Crawford, Barbara M. Fingleton, Bonnee Rubinfeld, Paul Polakis
and Lynn M. Matrisian 1999 The metalloproteinase matrilysin is a target of
β-catenin transactivation in intestinal tumours. Oncogene 18, 2883-91.

59. Meng J, Glick JL, Polakis P, Casey PJ. 1999 Functional interaction between
Galpha(z) and Rap1GAP suggests a novel form of cellular cross-talk. J Biol Chem.
17, 36663-9



60. Vijayasurian Easwaran, Virginia Song, Paul Polakis and Steve Byers 1999 The
ubiquitin-proteasome pathway and serine kinase activity modulate APC mediated
regulation of β-catenin-LEF signaling. J. Biol. Chem. 274, 16641-5.

61. Polakis P, Hart M and Rubinfeld B. 1999 Defects in the regulation of beta-
catenin in colorectal cancer. Adv Exp Med Biol. 470, 23-32.

62. Shen Z, Batzer A, Koehler JA, Polakis P, Schlessinger J, Lydon NB, Moran MF.
1999 Evidence for SH3 domain directed binding and phosphorylation of Sam68 by
Src. Oncogene 18, 4647-53.



64. Thomas GM, Frame S, Goedert M, Nathke I, Polakis P, Cohen P. 1999 A
GSK3-binding peptide from FRAT1 selectively inhibits the GSK3-catalysed
phosphorylation of axin and beta-catenin. FEBS Lett 458, 247-51.

65. Peifer M, Polakis P. 2000 Wnt signaling in oncogenesis and embryogenesis--a
look outside the nucleus. Science 287, 1606-9.

66. Polakis P. 2000 Wnt signaling and cancer. Genes Dev. 14, 1837-1851.

67. Spink KE, Polakis P, Weis WI 2000 Structural basis of the Axin-adenomatous
polyposis coli interaction. EMBO J 19, 2276-2279.

68. Szeto, W., Jiang, W., Tice, D.A., Rubinfeld, B., Hollingshead, P.G., Fong, S.E.,
Dugger, D.L., Pham, T., Yansura, D.E., Wong, T.A., Grimaldi, J.C., Corpuz, R.T.,
Singh, J.S., Frantz, G.D., Devaux, B., Crowley, C.W., Schwall, R.H., Eberhard,
D.A., Rastelli, L., Polakis, P. and Pennica, D. 2001 Overexpression of the Retinoic
Acid-Responsive Gene Stra6 in Human Cancers and its Synergistic Induction by Wnt-1
and Retinoic Acid. Cancer Res 61, 4197-4204.

69. Rubinfeld B, Tice DA, Polakis P. 2001 Axin dependent phosphorylation of the
adenomatous polyposis coli protein mediated by casein kinase 1 epsilon. J Biol
Chem 276, 39037-39045.

70. Polakis P. 2001 More than one way to skin a catenin. Cell 2001 105, 563-566.

71. Tice DA, Soloviev I, Polakis P. 2002 Activation of the Wnt Pathway Interferes
with Serum Response Element-driven Transcription of Immediate Early Genes. J
Biol. Chem. 277, 6118-6123.



72. Tice DA, Szeto W, Soloviev I, Rubinfeld B, Fong SE, Dugger DL, Winer J,
Williams PM, Wieand D, Smith V, Schwall RH, Pennica D, Polakis P. 2002
Synergistic activation of tumor antigens by wnt-1 signaling and retinoic acid revealed
by gene expression profiling. J Biol Chem. 277, 14329-14335.

73. Polakis, P. 2002 Casein kinase I: A wnt'er of disconnect. Curr. Biol. 12, R499.

74. Mao, W., Luis, E., Ross, S., Silva, J., Tan, C., Crowley, C., Chui, C., Franz, G.,
Senter, P., Koeppen, H., Polakis, P. 2004 EphB2 as a therapeutic antibody drug
target for the treatment of colorectal cancer. Cancer Res. 64, 781-788.

75. Shibamoto, S., Winer, J., Williams, M., Polakis, P. 2004 A Blockade in Wnt
signaling is activated following the differentiation of F9 teratocarcinoma cells. Exp.
Cell Res. 292, 11-20.

76. Zhang Y, Eberhard DA, Frantz GD, Dowd P, Wu TD, Zhou Y, Watanabe C, Luoh SM, Polakis P,
Hillan KJ, Wood WI, Zhang Z. 2004 GEPIS--quantitative gene expression profiling in normal and
cancer tissues. Bioinformatics, April 8



MOLECULAR BIOLOGY OF 

THE CELL 

THIRD EDITION



Text Editor: Miranda Robertson
Managing Editor: Ruth Adams
Illustrator: Nigel Orme

Molecular Model Drawings: Kate Hesketh-Moore
Director of Electronic Publishing: John M. Roblin
Computer Specialist: Chuck Bartelt
Disk Preparation: Carol Winter
Copy Editor: Shirley M. Cobert
Production Editor: Douglas Goertzen
Production Coordinator: Perry Bessas
Indexer: Maija Hinkle

Bruce Alberts received his Ph.D. from Harvard University and is
currently President of the National Academy of Sciences and Professor
of Biochemistry and Biophysics at the University of California, San
Francisco. Dennis Bray received his Ph.D. from the Massachusetts
Institute of Technology and is currently a Medical Research Council
Fellow in the Department of Zoology, University of Cambridge.
Julian Lewis received his D.Phil. from the University of Oxford and is
currently a Senior Scientist in the Imperial Cancer Research Fund
Developmental Biology Unit, University of Oxford. Martin Raff received
his M.D. from McGill University and is currently a Professor in the MRC
Laboratory for Molecular Cell Biology and the Biology Department,
University College London. Keith Roberts received his Ph.D. from the
University of Cambridge and is currently Head of the Department of Cell
Biology, the John Innes Institute, Norwich. James D. Watson received his
Ph.D. from Indiana University and is currently Director of the Cold Spring 
Harbor Laboratory. He is the author of Molecular Biology of the Gene and, 
with Francis Crick and Maurice Wilkins, won the Nobel Prize in Medicine 
and Physiology in 1962. 



© 1983, 1989, 1994 by Bruce Alberts, Dennis Bray, Julian Lewis, 
Martin Raff, Keith Roberts, and James D. Watson. 

All rights reserved. No part of this book covered by the copyright hereon 
may be reproduced or used in any form or by any means — graphic, 
electronic, or mechanical, including photocopying, recording, taping, or
information storage and retrieval systems— without permission of the 
publisher. 



Library of Congress Cataloging-in-Publication Data
Molecular biology of the cell / Bruce Alberts . . . [et al.]. -- 3rd ed.
p. cm.

Includes bibliographical references and index.

ISBN 0-8153-1619-4 (hard cover). -- ISBN 0-8153-1620-8 (pbk.)

1. Cytology. 2. Molecular biology. I. Alberts, Bruce.

[DNLM: 1. Cells. 2. Molecular Biology. QH 581.2 M718 1994]
QH581.2.M64 1994 
574.87— dc20 
DNLM/DLC 

for Library of Congress 93-45907 

CIP 

Published by Garland Publishing, Inc. 
717 Fifth Avenue, New York, NY 10022 

Printed in the United States of America 
15 14 13 12 10 9 8 7 



Front cover: The photograph shows a rat nerve cell
in culture. It is labeled (yellow) with a fluorescent
antibody that stains its cell body and dendritic
processes. Nerve terminals (green) from other
neurons (not visible), which have made synapses on
the cell, are labeled with a different antibody.
(Courtesy of Olaf Mundigl and Pietro de Camilli.)

Dedication page: Gavin Borden, late president
of Garland Publishing, weathered in during his
mid-1980s climb near Mount McKinley with
MBoC author Bruce Alberts and famous mountaineer
guide Mugs Stump (1940-1992).

Back cover: The authors, in alphabetical order,
crossing Abbey Road in London on their way to lunch.
Much of this third edition was written in a house just
around the corner. (Photograph by Richard Olivier.)



^cts. If these minor cell proteins differ among cells to the same extent as the
more abundant proteins, as is commonly assumed, only a small number of
protein differences (perhaps several hundred) suffice to create very large differences
in cell morphology and behavior.

A Cell Can Change the Expression of Its Genes 
in Response to External Signals

Most of the specialized cells in a multicellular organism are capable of altering 
their patterns of gene expression in response to extracellular cues. If a liver cell 
is exposed to a glucocorticoid hormone, for example, the production of several 
specific proteins is dramatically increased. Glucocorticoids are released during 
periods of starvation or intense exercise and signal the liver to increase the 
production of glucose from amino acids and other small molecules; the set of 
proteins whose production is induced includes enzymes such as tyrosine
aminotransferase, which helps to convert tyrosine to glucose. When the hormone is no
longer present, the production of these proteins drops to its normal level. 

Other cell types respond to glucocorticoids in different ways. In fat cells, for
example, the production of tyrosine aminotransferase is reduced, while some
other cell types do not respond to glucocorticoids at all. These examples illustrate
a general feature of cell specialization: different cell types often respond in
different ways to the same extracellular signal. Underlying this specialization are
features that do not change, which give each cell type its permanently
distinctive character. These features reflect the persistent expression of different sets of
genes.



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein

If differences between the various cell types of an organism depend on the par- 
ticular genes that the cells express, at what level is the control of gene expression 
exercised? There are many steps in the pathway leading from DNA to protein, and 
all of them can in principle be regulated. Thus a cell can control the proteins it 
makes by (1) controlling when and how often a given gene is transcribed (tran- 
scriptional control), (2) controlling how the primary RNA transcript is spliced or 
otherwise processed (RNA processing control), (3) selecting which completed 
mRNAs in the cell nucleus are exported to the cytoplasm (RNA transport
control), (4) selecting which mRNAs in the cytoplasm are translated by ribosomes
(translational control), (5) selectively destabilizing certain mRNA molecules in
the cytoplasm (mRNA degradation control), or (6) selectively activating,
inactivating, or compartmentalizing specific protein molecules after they have been
made (protein activity control) (Figure 9-2).

For most genes transcriptional controls are paramount. This makes sense
because, of all the possible control points illustrated in Figure 9-2, only
transcriptional control ensures that no superfluous intermediates are synthesized. In the
following sections we discuss the DNA and protein components that regulate the
initiation of gene transcription. We return at the end of the chapter to the other
ways of regulating gene expression.

Figure 9-2 Six steps at which eucaryote gene expression can be
controlled. Only controls that operate at steps 1 through 5 are discussed in
this chapter. The regulation of protein activity (step 6) is discussed in
Chapter 5; this includes reversible activation or inactivation by protein
phosphorylation as well as irreversible inactivation by proteolytic
degradation.

Summary 

The genome of a cell contains in its DNA sequence the information to make many
thousands of different protein and RNA molecules. A cell typically expresses only a
fraction of its genes, and the different types of cells in multicellular organisms arise
because different sets of genes are expressed. Moreover, cells can change the pattern
of genes they express in response to changes in their environment, such as signals from
other cells. Although all of the steps involved in expressing a gene can in principle be
regulated, for most genes the initiation of RNA transcription is the most important
point of control.



DNA-binding Motifs in Gene 
Regulatory Proteins

How does a cell determine which of its thousands of genes to transcribe? As dis- 
cussed in Chapter 8, the transcription of each gene is controlled by a regulatory 
region of DNA near the site where transcription begins. Some regulatory regions 
are simple and act as switches that are thrown by a single signal. Other regula- 
tory regions are complex and act as tiny microprocessors, responding to a vari- 
ety of signals that they interpret and integrate to switch the neighboring gene on 
or off. Whether complex or simple, these switching devices consist of two
fundamental types of components: (1) short stretches of DNA of defined sequence
and (2) gene regulatory proteins that recognize and bind to them.

We begin our discussion of gene regulatory proteins by describing how these 
proteins were discovered. 



Gene Regulatory Proteins Were Discovered Using
Bacterial Genetics

Genetic analyses in bacteria carried out in the 1950s provided the first evidence 
of the existence of gene regulatory proteins that turn specific sets of genes on 
or off. One of these regulators, the lambda repressor, is encoded by a bacterial 
virus, bacteriophage lambda. The repressor shuts off the viral genes that code for 
the protein components of new virus particles and thereby enables the viral ge- 
nome to remain a silent passenger in the bacterial chromosome, multiplying with 
the bacterium when conditions are favorable for bacterial growth (see Figure
6-80). The lambda repressor was among the first gene regulatory proteins to be
characterized, and it remains one of the best understood, as we discuss later.
Other bacterial regulators respond to nutritional conditions by shutting off genes 
encoding specific sets of metabolic enzymes when they are not needed. The lac 
repressor, for example, the first of these bacterial proteins to be recognized, turns 
off the production of the proteins responsible for lactose metabolism when this 
sugar is absent from the medium. 

The first step toward understanding gene regulation was the isolation of
mutant strains of bacteria and bacteriophage lambda that were unable to shut 
off specific sets of genes. It was proposed at the time, and later proved, that most 
of these mutants were deficient in proteins acting as specific repressors for these 
sets of genes. Because these proteins, like most gene regulatory proteins, are 
present in small quantities, it was difficult and time-consuming to isolate them. 
They were eventually purified by fractionating cell extracts on a series of
standard chromatography columns (see pp. 166-169). Once isolated, the
proteins were shown to bind to specific DNA sequences close to the genes that they

Figure 9-3 Double-helical structure
of DNA. The major and minor grooves
on the outside of the double helix are
indicated. The atoms are colored as
follows: carbon, dark blue; nitrogen,
light blue; hydrogen, white; oxygen,
red; phosphorus, yellow.

Figure 9-71 A mechanism to explain
both the marked deficiency of CG
sequences and the presence of CG
islands in vertebrate genomes. A
black line marks the location of an
unmethylated CG dinucleotide in the
DNA sequence, while a red line marks
the location of a methylated CG
dinucleotide.

Summary

The many types of cells in animals and plants are created largely through
mechanisms that cause different genes to be transcribed in different cells. Since many
specialized animal cells can maintain their unique character when grown in culture, the
gene regulatory mechanisms involved in creating them must be stable once
established and heritable when the cell divides, endowing the cell with a memory of its
developmental history. Procaryotes and yeasts provide unusually accessible model
systems in which to study gene regulatory mechanisms, some of which may be rel- 
evant to the creation of specialized cell types in higher eucaryotes. One such mecha- 
nism involves a competitive interaction between two (or more) gene regulatory pro- 
teins, each of which inhibits the synthesis of the other; this can create a flip-flop 
switch that switches a cell between two alternative patterns of gene expression. Di- 
rect or indirect positive feedback loops, which enable gene regulatory proteins to 
perpetuate their own synthesis, provide a general mechanism for cell memory. 

In eucaryotes gene transcription is generally controlled by combinations of gene 
regulatory proteins. It is thought that each type of cell in a higher eucaryotic organism 
contains a specific combination of gene regulatory proteins that ensures the
expression of only those genes appropriate to that type of cell. A given gene regulatory
protein may be expressed in a variety of circumstances and typically is involved in the
regulation of many genes.

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also utilized by eucaryotic cells to regulate gene expression. In
vertebrates DNA methylation also plays a part, mainly as a device to reinforce decisions
about gene expression that are made initially by other mechanisms.
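
The flip-flop switch described in the summary above can be illustrated with a
small numerical sketch. The model below is not taken from the text: the rate
law, the parameter values (alpha, n), and the function name simulate_toggle are
all illustrative assumptions. It treats two regulators, X and Y, each of which
inhibits the synthesis of the other, and shows that the starting condition
decides which of two stable expression patterns the cell settles into.

```python
def simulate_toggle(x0, y0, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate two mutually repressing regulators X and Y.

    Hypothetical model: each protein is made at a rate that falls as the
    other accumulates, and each decays in proportion to its own level.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = alpha / (1.0 + y ** n) - x  # Y inhibits synthesis of X
        dy = alpha / (1.0 + x ** n) - y  # X inhibits synthesis of Y
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Same parameters, two different histories, two different stable states.
x_hi, y_lo = simulate_toggle(2.0, 0.5)  # X starts ahead and stays ahead
x_lo, y_hi = simulate_toggle(0.5, 2.0)  # Y starts ahead and stays ahead
```

Under these assumed parameters the regulator that starts ahead ends up high and
holds the other low, which is one simple way a transient difference could be
remembered as a lasting pattern of gene expression.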



Posttranscriptional Controls

Although controls on the initiation of gene transcription are the predominant
form of regulation for most genes, other controls can act later in the pathway
from RNA to protein to modulate the amount of gene product that is made.
Although these posttranscriptional controls, which operate after RNA polymerase
has bound to the gene's promoter and begun RNA synthesis, are less common
than transcriptional control, for many genes they are crucial. It seems that every
step in gene expression that could be controlled in principle is likely to be
regulated under some circumstances for some genes.

We consider the varieties of posttranscriptional regulation in temporal
order, according to the sequence of events that might be experienced by an RNA
molecule after its transcription has begun (Figure 9-72).

expression. Only a few of these
controls are likely to be used for any 

Figure 6-3 Genes can be expressed
with different efficiencies. Gene A is
transcribed and translated much more
efficiently than gene B. This allows the
amount of protein A in the cell to be
much greater than that of protein B.

FROM DNA TO RNA

Transcription and translation are the means by which cells read out, or express,
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis 
of many identical protein molecules, cells can synthesize a large amount of 
protein rapidly when necessary. But each gene can also be transcribed and
translated with a different efficiency, allowing the cell to make vast quantities of 
some proteins and tiny quantities of others (Figure 6-3). Moreover, as we see in 
the next chapter, a cell can change (or regulate) the expression of each of its 
genes according to the needs of the moment— most obviously by controlling 
the production of its RNA. 
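
The two-stage amplification described above can be made concrete with a little
arithmetic. The figures below are invented purely for illustration (the text
gives no numbers), and protein_output is a hypothetical helper: it assumes that
the protein yield of a gene is simply the number of RNA copies made multiplied
by the number of protein molecules translated from each copy.

```python
def protein_output(rna_copies: int, proteins_per_rna: int) -> int:
    """Total protein molecules, assuming every RNA copy is translated equally."""
    return rna_copies * proteins_per_rna


# Hypothetical numbers: gene A is transcribed and translated efficiently,
# gene B is not.
gene_a = protein_output(rna_copies=1000, proteins_per_rna=100)
gene_b = protein_output(rna_copies=10, proteins_per_rna=10)
fold_difference = gene_a // gene_b
```

Because the two stages multiply, a 100-fold difference in transcription and a
10-fold difference in translation combine, in this sketch, into a 1000-fold
difference in protein level.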



Portions of DNA Sequence Are Transcribed into RNA 

The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence — a gene — into an 
RNA nucleotide sequence. The information in RNA, although copied into another
chemical form, is still written in essentially the same language as it is in DNA — 
the language of a nucleotide sequence. Hence the name transcription. 

Like DNA, RNA is a linear polymer made of four different types of nucleotide 
subunits linked together by phosphodiester bonds (Figure 6-4). It differs from 
DNA chemically in two respects: (1) the nucleotides in RNA are
ribonucleotides; that is, they contain the sugar ribose (hence the name
ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the
bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U)
instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-
bonding with A (Figure 6-5), the complementary base-pairing properties
described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with
C, and A pairs with U). It is not uncommon, however, to find other types of base
pairs in RNA: for example, G pairing with U occasionally.
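
The pairing rules in the paragraph above can be written out as a short sketch.
The function names transcribe and can_pair and the example sequence are
illustrative assumptions, not anything taken from the text. The template strand
is assumed to be read one base at a time, and each base is replaced by its RNA
partner (A by U, T by A, G by C, C by G).

```python
# RNA base that pairs with each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard pairs within RNA, plus the occasional G-U pair mentioned above.
STANDARD_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}


def transcribe(template: str) -> str:
    """Return the RNA complementary to a DNA template strand, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


def can_pair(a: str, b: str, allow_gu: bool = False) -> bool:
    """True if two RNA bases can pair; G-U is accepted only when allow_gu is set."""
    pair = (a.upper(), b.upper())
    return pair in STANDARD_PAIRS or (allow_gu and pair in GU_PAIRS)


rna = transcribe("TACGGT")
```

In this sketch the RNA contains U wherever the template has A, and no T at all;
strand direction (5' to 3') is deliberately left out to keep the example small.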

Despite these small chemical differences, DNA and RNA differ quite dra- 
matically in overall structure. Whereas DNA always occurs in cells as a double- 
stranded helix, RNA is single-stranded. RNA chains therefore fold up into a 
variety of shapes, just as a polypeptide chain folds up to form the final shape of 
a protein (Figure 6-6). As we see later in this chapter, the ability to fold into
complex three-dimensional shapes allows some RNA molecules to have structural
and catalytic functions. 

Transcription Produces RNA Complementary to 
One Strand of DNA 

All of the RNA in a cell is made by DNA transcription, a process that has cer- 
tain similarities to the process of DNA replication discussed in Chapter 5. 

Chapter 6: HOW CELLS READ THE GENOME: FROM DNA TO PROTEIN




Figure 6-89 Protein aggregates that cause human disease. (A) Schematic illustration of the type of
conformational change in a protein that produces material for a cross-beta filament. (B) Diagram illustrating
the self-infectious nature of the protein aggregation that is central to prion diseases. PrP is highly unusual
because the misfolded version of the protein, called PrP*, induces the normal PrP protein it contacts to
change its conformation, as shown. Most of the human diseases caused by protein aggregation are caused by
the overproduction of a variant protein that is especially prone to aggregation, but because this structure is
not infectious in this way, it cannot spread from one animal to another. (C) Drawing of a cross-beta filament,
a common type of protease-resistant protein aggregate found in a variety of human neurological diseases.
Because the hydrogen-bond interactions in a β sheet form between polypeptide backbone atoms (see Figure
3-9), a number of different abnormally folded proteins can produce this structure. (D) One of several
possible models for the conversion of PrP to PrP*, showing the likely change of two α-helices into four
β-strands. Although the structure of the normal protein has been determined accurately, the structure of the
infectious form is not yet known with certainty because the aggregation has prevented the use of standard
structural techniques. (C, courtesy of Louise Serpell, adapted from M. Sunde et al., J. Mol. Biol. 273:729-739,
1997; D, adapted from S.B. Prusiner, Trends Biochem. Sci. 21:482-487, 1996.)

animals and humans. It can be dangerous to eat the tissues of animals that
contain PrP*, as witnessed most recently by the spread of BSE (commonly referred
to as the "mad cow disease") from cattle to humans in Great Britain.

Fortunately, in the absence of PrP*, PrP is extraordinarily difficult to convert
to its abnormal form. Although very few proteins have the potential to misfold
into an infectious conformation, a similar transformation has been discovered
to be the cause of an otherwise mysterious "protein-only inheritance" observed
in yeast cells.

There Are Many Steps From DNA to Protein 

We have seen so far in this chapter that many different types of chemical
reactions are required to produce a properly folded protein from the information
contained in a gene (Figure 6-90). The final level of a properly folded protein in
a cell therefore depends upon the efficiency with which each of the many steps
is performed.

We discuss in Chapter 7 that cells have the ability to change the levels of
their proteins according to their needs. In principle, any or all of the steps in
Figure 6-90 could be regulated by the cell for each individual protein. However, as
we shall see in Chapter 7, the initiation of transcription is the most common
point for a cell to regulate the expression of each of its genes. This makes sense,
inasmuch as the most efficient way to keep a gene from being expressed is to
block the very first step: the transcription of its DNA sequence into an RNA
molecule.

Figure 6-90 The production of a protein by a eucaryotic cell. The final
level of each protein in a eucaryotic cell depends upon the efficiency of each
step depicted.



Summary 

The translation of the nucleotide sequence of an mRNA molecule into protein takes
place in the cytoplasm on a large ribonucleoprotein assembly called a ribosome. The
amino acids used for protein synthesis are first attached to a family of tRNA
molecules, each of which recognizes, by complementary base-pair interactions,
particular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in
the mRNA is then read from one end to the other in sets of three according to the
genetic code.

To initiate translation, a small ribosomal subunit binds to the mRNA molecule
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A
large ribosomal subunit binds to complete the ribosome and begin the elongation
phase of protein synthesis. During this phase, aminoacyl tRNAs, each bearing a
specific amino acid, bind sequentially to the appropriate codon in mRNA by forming
complementary base pairs with the tRNA anticodon. Each amino acid is added to the
C-terminal end of the growing polypeptide by means of a cycle of three sequential

Figure 7-5 Six steps at which eucaryotic gene expression can be
controlled. Controls that operate at steps 1 through 5 are discussed in this
chapter. Step 6, the regulation of protein activity, includes reversible
activation or inactivation by protein phosphorylation (discussed in Chapter 3)
as well as irreversible inactivation by proteolytic degradation (discussed in
Chapter 6).



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein 

If differences among the various cell types of an organism depend on the
particular genes that the cells express, at what level is the control of gene expression
exercised? As we saw in the last chapter, there are many steps in the pathway
leading from DNA to protein, and all of them can in principle be regulated. Thus
a cell can control the proteins it makes by (1) controlling when and how often a
given gene is transcribed (transcriptional control), (2) controlling how the RNA
transcript is spliced or otherwise processed (RNA processing control), (3)
selecting which completed mRNAs in the cell nucleus are exported to the cytosol
and determining where in the cytosol they are localized (RNA transport and
localization control), (4) selecting which mRNAs in the cytoplasm are translated
by ribosomes (translational control), (5) selectively destabilizing certain mRNA
molecules in the cytoplasm (mRNA degradation control), or (6) selectively
activating, inactivating, degrading, or compartmentalizing specific protein
molecules after they have been made (protein activity control) (Figure 7-5).

For most genes transcriptional controls are paramount. This makes sense
because, of all the possible control points illustrated in Figure 7-5, only
transcriptional control ensures that the cell will not synthesize superfluous
intermediates. In the following sections we discuss the DNA and protein components
that perform this function by regulating the initiation of gene transcription. We
shall return at the end of the chapter to the additional ways of regulating gene
expression.

Summary 

The genome of a cell contains in its DNA sequence the information to make many
thousands of different protein and RNA molecules. A cell typically expresses only a
fraction of its genes, and the different types of cells in multicellular organisms arise
because different sets of genes are expressed. Moreover, cells can change the pattern
of genes they express in response to changes in their environment, such as signals
from other cells. Although all of the steps involved in expressing a gene can in
principle be regulated, for most genes the initiation of RNA transcription is the most
important point of control.



DNA-BINDING MOTIFS IN GENE REGULATORY 
PROTEINS 

How does a cell determine which of its thousands of genes to transcribe? As
mentioned briefly in Chapters 4 and 6, the transcription of each gene is
controlled by a regulatory region of DNA relatively near the site where transcription
begins. Some regulatory regions are simple and act as switches that are thrown
by a single signal. Many others are complex and act as tiny microprocessors,
responding to a variety of signals that they interpret and integrate to switch the
neighboring gene on or off. Whether complex or simple, these switching devices

occur in the germ line, the cell lineage that gives rise to sperm or eggs. Most of
the DNA in vertebrate germ cells is inactive and highly methylated. Over long
periods of evolutionary time, the methylated CG sequences in these inactive
regions have presumably been lost through spontaneous deamination events
that were not properly repaired. However, promoters of genes that remain active
in the germ cell lineages (including most housekeeping genes) are kept
unmethylated, and therefore spontaneous deaminations of Cs that occur within
them can be accurately repaired. Such regions are preserved in modern-day
vertebrate cells as CG islands. In addition, any mutation of a CG sequence in the
genome that destroyed the function or regulation of a gene in the adult would be
selected against, and some CG islands are simply the result of a higher than
normal density of critical CG sequences.

The mammalian genome contains an estimated 20,000 CG islands. Most of
the islands mark the 5' ends of transcription units and thus, presumably, of 
genes. The presence of CG islands often provides a convenient way of identify- 
ing genes in the DNA sequences of vertebrate genomes. 
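
The idea that CG-rich stretches flag the 5' ends of genes can be illustrated with a short calculation. The following is a minimal Python sketch, not a method given in the text: it assumes the widely used observed/expected CG ratio as the measure of CG density, and the thresholds (`min_gc`, `min_ratio`) and function names are illustrative assumptions rather than values from this chapter.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are C or G."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("C") + seq.count("G")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed CG dinucleotides divided by the number expected
    from the C and G content alone: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cg_island(seq: str, min_gc: float = 0.5,
                         min_ratio: float = 0.6) -> bool:
    """Rule-of-thumb test; both thresholds are assumptions."""
    return gc_content(seq) >= min_gc and cpg_obs_exp(seq) >= min_ratio


if __name__ == "__main__":
    # Toy sequences, far shorter than any real island.
    for s in ("CGCGCGCG", "CCCCGGGG", "ATATATAT"):
        print(s, round(cpg_obs_exp(s), 2), looks_like_cg_island(s))
```

A region depleted of CG by deamination of methylated C gives a ratio well below 1, whereas an unmethylated island keeps a ratio near or above 1. Real scans apply such a test in sliding windows of a few hundred nucleotides.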

Summary 

The many types of cells in animals and plants are created largely through
mechanisms that cause different genes to be transcribed in different cells. Since many
specialized animal cells can maintain their unique character through many cell
division cycles and even when grown in culture, the gene regulatory mechanisms
involved in creating them must be stable once established and heritable when the
cell divides. These features endow the cell with a memory of its developmental history.
Bacteria and yeasts provide unusually accessible model systems in which to study
gene regulatory mechanisms. One such mechanism involves a competitive
interaction between two gene regulatory proteins, each of which inhibits the synthesis of the
other; this can create a flip-flop switch that switches a cell between two alternative
patterns of gene expression. Direct or indirect positive feedback loops, which enable
gene regulatory proteins to perpetuate their own synthesis, provide a general
mechanism for cell memory. Negative feedback loops with programmed delays form the
basis for cellular clocks.
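
The flip-flop behavior of two mutually inhibiting regulatory proteins can be sketched numerically. The following Python example is a toy model under stated assumptions, not something given in the text: each protein is synthesized at a rate `alpha / (1 + other**n)` and lost at a rate proportional to its own level, and the parameter values and the function name `simulate_toggle` are hypothetical.

```python
def simulate_toggle(u0: float, v0: float, alpha: float = 4.0, n: int = 2,
                    dt: float = 0.01, steps: int = 5000) -> tuple:
    """Integrate a two-protein mutual-repression model with Euler steps.

    u and v are the levels of the two regulatory proteins; each one
    represses synthesis of the other (assumed Hill-type repression).
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u   # synthesis of u blocked by v
        dv = alpha / (1.0 + u ** n) - v   # synthesis of v blocked by u
        u, v = u + dt * du, v + dt * dv
    return u, v


if __name__ == "__main__":
    # The same circuit settles into opposite states depending on
    # which protein starts with a small advantage.
    print(simulate_toggle(2.0, 1.0))
    print(simulate_toggle(1.0, 2.0))
```

With these assumed parameters the protein that starts ahead ends up high and holds the other low, and the state persists without any further input, which is the sense in which such a circuit acts as a memory.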

In eucaryotes the transcription of a gene is generally controlled by combinations
of gene regulatory proteins. It is thought that each type of cell in a higher eucaryotic
organism contains a specific combination of gene regulatory proteins that ensures
the expression of only those genes appropriate to that type of cell. A given gene
regulatory protein may be active in a variety of circumstances and typically is involved
in the regulation of many genes.

In addition to diffusible gene regulatory proteins, inherited states of chromatin
condensation are also used by eucaryotic cells to regulate gene expression. An
especially dramatic case is the inactivation of an entire X chromosome in female
mammals. In vertebrates DNA methylation also functions in gene regulation, being used
mainly as a device to reinforce decisions about gene expression that are made
initially by other mechanisms. DNA methylation also underlies the phenomenon of
genomic imprinting in mammals, in which the expression of a gene depends on
whether it was inherited from the mother or the father.



POSTTRANSCRIPTIONAL CONTROLS 

In principle, every step required for the process of gene expression could be
controlled. Indeed, one can find examples of each type of regulation, although
any one gene is likely to use only a few of them. Controls on the initiation of
gene transcription are the predominant form of regulation for most genes. But
other controls can act later in the pathway from DNA to protein to modulate
the amount of gene product that is made. Although these posttranscriptional
controls, which operate after RNA polymerase has bound to the gene's promoter
and begun RNA synthesis, are less common than transcriptional control, for
many genes they are crucial.

Figure 7-86 A mechanism to explain both the marked overall deficiency
of CG sequences and their clustering into CG islands in vertebrate
genomes. A black line marks the location of a CG dinucleotide in the DNA
sequence, while a red "lollipop" indicates the presence of a methyl group on the
CG dinucleotide. CG sequences that lie in regulatory sequences of genes that are
transcribed in germ cells are unmethylated and therefore tend to be retained in
evolution. Methylated CG sequences, on the other hand, tend to be lost through
deamination of 5-methyl C to T, unless the CG sequence is critical for survival.

CHAPTER 29

Regulation of transcription



The phenotypic differences that distinguish the
various kinds of cells in a higher eukaryote are
largely due to differences in the expression of
genes that code for proteins, that is, those
transcribed by RNA polymerase II. In principle, the
expression of these genes might be regulated at
any one of several stages. The concept of the
"level of control" implies that gene expression
is not necessarily an automatic process once it
has begun. It could be regulated in a
gene-specific way at any one of several sequential
steps. We can distinguish (at least) five
potential control points, forming the series:

Activation of gene structure
↓
Initiation of transcription
↓
Processing the transcript
↓
Transport to cytoplasm
↓
Translation of mRNA

The existence of the first step is implied by
the discovery that genes may exist in either of
two structural conditions. Relative to the state
of most of the genome, genes are found in
an "active" state in the cells in which they
are expressed (see Chapter 27). The change of
structure is distinct from the act of
transcription, and indicates that the gene is
"transcribable." This suggests that acquisition of the
"active" structure must be the first step in gene
expression.

Transcription of a gene in the active state is
controlled at the stage of initiation, that is, by
the interaction of RNA polymerase with its
promoter. This is now becoming susceptible to
analysis in the in vitro systems (see Chapter
28). For most genes, this is a major control
point; probably it is the most common level of
regulation.

There is at present no evidence for control
at subsequent stages of transcription in
eukaryotic cells, for example, via antitermination
mechanisms.

The primary transcript is modified by capping
at the 5' end, and usually also by
polyadenylation at the 3' end. Introns must be spliced out
from the transcripts of interrupted genes. The
mature RNA must be exported from the nucleus
to the cytoplasm. Regulation of gene expression
by selection of sequences at the level of nuclear
RNA might involve any or all of these stages,
but the one for which we have most evidence
concerns changes in splicing; some genes are
expressed by means of alternative splicing
patterns whose regulation controls the type of
protein product (see Chapter 30).

Finally, the translation of an mRNA in the
cytoplasm can be specifically controlled. There is little
evidence for the employment of this mechanism in
adult somatic cells, but it does occur in some
embryonic situations, as described in Chapter 7.
The mechanism is presumed to involve the
blocking of initiation of translation of some mRNAs by
specific protein factors.

But having acknowledged that control of gene
expression can occur at multiple stages, and
that production of RNA cannot inevitably be
equated with production of protein, it is clear
that the overwhelming majority of regulatory
events occur at the initiation of transcription.
Regulation of tissue-specific gene transcription
lies at the heart of eukaryotic differentiation;
indeed, we see examples in Chapter 38 in
which proteins that regulate embryonic
development prove to be transcription factors. A
regulatory transcription factor serves to provide
common control of a large number of target
genes, and we seek to answer two questions
about this mode of regulation: what identifies
the common target genes to the transcription
factor, and how is the activity of the
transcription factor itself regulated in response to
intrinsic or extrinsic signals?



Response elements identify genes under common
regulation



The principle that emerges from characterizing
groups of genes under common control is that
they share a promoter element that is recognized
by a regulatory transcription factor. An element
that causes a gene to respond to such a factor
is called a response element; examples are the
HSE (heat shock response element), GRE
(glucocorticoid response element), SRE (serum
response element).

The properties of some inducible transcription
factors and the elements that they recognize are
summarized in Table 29.1. Response elements
have the same general characteristics as
upstream elements of promoters or enhancers.
They contain short consensus sequences, and
copies of the response elements found in
different genes are closely related, but not
necessarily identical. The region bound by the factor
extends for a short distance on either side of
the consensus sequence. In promoters, the
elements are not present at fixed distances from
the startpoint, but are usually <200 bp upstream
of it. The presence of a single element usually
is sufficient to confer the regulatory response,
but sometimes there are multiple copies.

Table 29.1 Inducible transcription factors bind to
response elements that identify groups of promoters
or enhancers subject to coordinate control.

Regulatory Agent   Module   Consensus          Factor
Heat shock         HSE      CNNGAANNTCCNNG     HSTF
Glucocorticoid     GRE      TGGTACAAATGTTCT    Receptor
Phorbol ester      TRE      TGACTCA            AP1
Serum              SRE      CCATATTAGG         SRF
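
Because response elements are defined by short consensus sequences, locating candidate copies in a promoter is a simple pattern search. The following is a minimal Python sketch under stated assumptions: the consensus strings are the HSE and TRE entries as read from Table 29.1, `N` is taken to mean any base, only exact matches to the fixed positions are reported (real copies are "closely related, but not necessarily identical"), and the function name and test sequences are hypothetical.

```python
import re


def find_sites(consensus: str, sequence: str) -> list:
    """Return (position, matched text) for every exact match of a
    consensus in which N stands for any of A, C, G or T."""
    core = "".join("[ACGT]" if base == "N" else base
                   for base in consensus.upper())
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + core + "))")
    return [(m.start(), m.group(1))
            for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    HSE = "CNNGAANNTCCNNG"   # heat shock element, as read from Table 29.1
    TRE = "TGACTCA"          # phorbol ester response element
    # Made-up sequences used only to exercise the search.
    print(find_sites(HSE, "AAACTTGAACCTCCAAGTTT"))
    print(find_sites(TRE, "GGTGACTCAGG"))
```

A search that tolerates one or two mismatches would be closer to how related copies are recognized in different genes; exact matching is used here only to keep the sketch short.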

Response elements may be located in
promoters or in enhancers. Some types of elements
are typically found in one rather than the other:
usually an HSE is found in a promoter, while a
GRE is found in an enhancer. We assume that
all response elements function by the same
general principle. A gene is regulated by a
sequence at the promoter or enhancer that is
recognized by a specific protein. The protein
functions as a transcription factor needed for
RNA polymerase to initiate. Active protein is
available only under conditions when the gene is
to be expressed; its absence means that the
promoter is not activated by this particular circuit.

An example of a situation in which many
genes are controlled by a single factor is
provided by the heat shock response. This is
common to a wide range of prokaryotes and
eukaryotes and involves multiple controls of
gene expression: an increase in temperature
turns off transcription of some genes, turns on
transcription of the heat shock genes, and
causes changes in the translation of mRNAs.
The control of the heat shock genes illustrates
the differences between prokaryotic and
eukaryotic modes of control. In bacteria, a new
sigma factor is synthesized that directs RNA
polymerase holoenzyme to recognize

Abstract

Background: Prostate stem cell antigen (PSCA) is a recently defined homologue of the Thy-1/Ly-6 family of
glycosylphosphatidylinositol (GPI)-anchored cell surface antigens. The purpose of the present study was to
examine the expression status of PSCA protein and mRNA in clinical specimens of human prostate cancer (Pca)
and to validate it as a potential molecular target for diagnosis and treatment of Pca.

Materials and Methods: Immunohistochemical (IHC) and in situ hybridization (ISH) analyses of PSCA
expression were simultaneously performed on paraffin-embedded sections from 20 benign prostatic hyperplasia
(BPH), 20 prostatic intraepithelial neoplasm (PIN) and 48 prostate cancer (Pca) tissues, including 9
androgen-independent prostate cancers. The level of PSCA expression was semiquantitatively scored by assessing both the
percentage and intensity of PSCA-positive staining cells in the specimens. We then compared PSCA expression
between BPH, PIN and Pca tissues and analysed the correlations of PSCA expression level with pathological grade,
clinical stage and progression to androgen-independence in Pca.

Results: In BPH and low grade PIN, PSCA protein and mRNA staining were weak or negative and less intense
and uniform than that seen in HGPIN and Pca. There was moderate to strong PSCA protein and mRNA
expression in 8 of 11 (72.7%) HGPIN and in 40 of 48 (83.4%) Pca specimens examined by IHC and ISH analyses,
with statistical significance compared with BPH (20%) and low grade PIN (22.2%) samples (p < 0.05, respectively).
The expression level of PSCA increased with high Gleason grade, advanced stage and progression to
androgen-independence (p < 0.05, respectively). In addition, IHC and ISH staining showed a high degree of correlation
between PSCA protein and mRNA overexpression.

Conclusions: Our data demonstrate that PSCA as a new cell surface marker is overexpressed by a majority of
human Pca. PSCA expression correlates positively with adverse tumor characteristics, such as increasing
pathological grade (poor cell differentiation), worsening clinical stage and androgen-independence, and
speculatively with prostate carcinogenesis. PSCA protein overexpression results from upregulated transcription
of PSCA mRNA. PSCA may have prognostic utility and may be a promising molecular target for diagnosis and
treatment of Pca.

Introduction

Prostate cancer (Pca) is the second leading cause of
cancer-related death in American men and is an
increasingly common cancer in China. Despite recent
great progress in the diagnosis and management of
localized disease, there continues to be a need for new
diagnostic markers that can accurately discriminate between
indolent and aggressive variants of Pca. There also
continues to be a need for the identification and characterization
of potential new therapeutic targets on Pca cells. Current
diagnostic and therapeutic modalities for recurrent and
metastatic Pca have been limited by a lack of specific
target antigens of Pca.

Although a number of prostate-specific genes have been
identified (i.e. prostate specific antigen, prostatic acid
phosphatase, glandular kallikrein 2), the majority of these
are secreted proteins not ideally suited for many
immunological strategies. So, the identification of new cell surface
antigens is critical to the development of new diagnostic
and therapeutic approaches to the management of Pca.

Reiter RE et al. [1] reported the identification of prostate
stem cell antigen (PSCA), a cell surface antigen that is
predominantly prostate specific. The PSCA gene encodes a
123 amino acid glycoprotein, with 30% homology to
stem cell antigen 2 (Sca-2). Like Sca-2, PSCA is also
a member of the Thy-1/Ly-6 family and is anchored by
a glycosylphosphatidylinositol (GPI) linkage. mRNA in
situ hybridization (ISH) localized PSCA expression in
normal prostate to the basal cell epithelium, the putative
stem cell compartment of prostatic epithelium, suggesting
that PSCA may be a marker of prostate stem/progenitor
cells.

In order to examine the status of PSCA protein and mRNA
expression in human Pca and validate it as a potential
diagnostic and therapeutic target for Pca, we used
immunohistochemistry (IHC) and in situ hybridization (ISH)
simultaneously, and conducted PSCA protein and mRNA
expression analyses in paraffin-embedded tissue
specimens of benign prostatic hyperplasia (BPH, n = 20),
prostate intraepithelial neoplasm (PIN, n = 20) and prostate
cancer (Pca, n = 48). Furthermore, we evaluated the
possible correlation of PSCA expression level with Pca
tumorigenesis, grade, stage and progression to
androgen-independence.

Materials and methods 
Tissue samples 

All of the clinical tissue specimens studied herein were
obtained from 80 patients aged 57-84 years by
prostatectomy, transurethral resection of prostate (TURP) or
biopsies. The patients were classified as 20 cases of BPH, 20
cases of PIN and 40 cases of primary Pca, including 9 patients
with recurrent Pca and a history of androgen ablation
therapy (orchiectomy and/or hormonal therapy), who
were referred to as androgen-independent prostate
cancers. Eight specimens were harvested from these
androgen-independent Pca patients prior to androgen ablation
treatment. Each tissue sample was cut into two parts: one
was fixed in 10% formalin for IHC and the other treated
with 4% paraformaldehyde/0.1 M PBS pH 7.4 in 0.1%
DEPC for 1 h for ISH analysis, and then embedded in
paraffin. All paraffin blocks examined were then cut into 5
μm sections and mounted on the glass slides specific for
IHC and ISH respectively in the usual fashion. An
H&E-stained section of each Pca was evaluated and assigned a
Gleason score by the experienced urological pathologist at
our institution based on the criteria of Gleason score [2].
The Gleason sums are summarized in Table 1. Clinical
staging was performed according to the
Jewett-Whitmore-Prout staging system, as shown in Table 2. In the category
of PIN, we graded the specimens into two groups, i.e. low
grade PIN (grade I - II) and high grade PIN (HGPIN,
grade III) on the basis of the literature [3,4].

Immunohistochemical (IHC) analysis
Briefly, tissue sections were deparaffinized, dehydrated,
and subjected to microwaving in 10 mmol/L citrate
buffer, pH 6.0 (Boshide, Wuhan, China) in a 900 W oven
for 5 min to induce epitope retrieval. Slides were allowed
to cool at room temperature for 30 min. A primary mouse
antibody specific to human PSCA (Boshide, Wuhan,
China) at a 1:100 dilution was applied and incubated with
the slides at room temperature for 2 h. Labeling was
detected by sequentially adding biotinylated secondary
antibodies and streptavidin-peroxidase, and localized
using the 3,3'-diaminobenzidine reaction. Sections were then
counterstained with hematoxylin. Substitution of the
primary antibody with phosphate-buffered saline (PBS)
served as a negative-staining control.

mRNA in situ hybridization (ISH)

Five-μm-thick tissue sections were deparaffinized and
dehydrated, then digested in pepsin solution (4 mg/ml in
3% citric acid) for 20 min at 37.5°C, and further
processed for ISH. Digoxigenin-labeled sense and antisense
human PSCA RNA probes (obtained from Boshide,
Wuhan, China) were hybridized to the sections at 48°C
overnight. The posthybridization wash at high
stringency was performed sequentially at 37°C in 2 × standard
saline citrate (SSC) for 10 min, in 0.5 × SSC for 15 min
and in 0.2 × SSC for 30 min. The slides were then
incubated with biotinylated mouse anti-digoxigenin antibody at
37.5°C for 1 h followed by washing in 1 × PBS for 20 min
at room temperature, and then with streptavidin-peroxidase
at 37.5°C for 20 min followed by washing in 1 × PBS for
15 min at room temperature. Subsequently, the slides
were developed with diaminobenzidine and then
counterstained with hematoxylin to localize the hybridization
signals. Sections hybridized with the sense control probes
routinely did not show any specific hybridization signal
above background. As a negative control, slides were
hybridized with PBS substituted for the probes.

                   Intensity × frequency
Gleason score      0-6 (%)       9 (%)
2-4                5 (83)        1 (17)
5-7                19 (79)       5 (21)
8-10               5 (28)        13 (72)

Table 2: Correlation of PSCA expression with clinical stage

                   Intensity × frequency
Tumor stage        0-6 (%)       9 (%)
A-B                27 (67.5)     13 (32.5)
C-D                2 (25)        6 (75)

Scoring methods 

To determine the correlation between the results of PSCA
immunostaining and mRNA in situ hybridization, the
same scoring method was used in the present study for
PSCA protein staining by IHC and PSCA mRNA staining
by ISH. Each slide was read and scored by two
experienced urological pathologists working independently, using Olympus
BX-41 light microscopes. The evaluation was done in a
blinded fashion. For each section, five areas of similar
grade were analyzed semiquantitatively for the fraction of
cells staining. Fifty percent of specimens were randomly
chosen and rescored to determine the degree of
interobserver and intraobserver concordance. There was greater
than 95% intra- and interobserver agreement.

The intensity of PSCA expression evaluated
microscopically was graded on a scale of 0 to 3+ with 3+ being the
highest expression observed (0, no staining; 1+, mildly
intense; 2+, moderately intense; 3+, severely intense). The
staining density was quantified as the percentage of cells
staining positive for PSCA with the primary antibody or
hybridization probe, as follows: 0 = no staining; 1 =
positive staining in <25% of the sample; 2 = positive staining
in 25%-50% of the sample; 3 = positive staining in >50%
of the sample. The intensity score (0 to 3+) was multiplied by
the density score (0-3) to give an overall score of 0-9
[1,5]. In this way, we were able to differentiate specimens
that may have had focal areas of increased staining from
those that had diffuse areas of increased staining [6]. The
overall score for each specimen was then categorically
assigned to one of the following groups: 0 score, negative
expression; 1-2 scores, weak expression; 3-6 scores,
moderate expression; 9 score, strong expression.
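
The scoring scheme above can be written out as a short calculation. The following is a minimal Python sketch of the rules as stated, not the authors' software; the function names are hypothetical, and the handling of a cell percentage of exactly 0 as "no staining" is an assumption.

```python
def density_score(percent_positive: float) -> int:
    """Map the percentage of positively staining cells to 0-3."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0          # no staining
    if percent_positive < 25:
        return 1          # <25% of the sample
    if percent_positive <= 50:
        return 2          # 25%-50% of the sample
    return 3              # >50% of the sample


def composite_score(intensity: int, density: int) -> int:
    """Overall score = intensity (0 to 3+) x density (0-3)."""
    if intensity not in (0, 1, 2, 3) or density not in (0, 1, 2, 3):
        raise ValueError("intensity and density must each be 0, 1, 2 or 3")
    return intensity * density


def expression_category(score: int) -> str:
    """Assign the overall score to the groups listed in the text."""
    if score == 0:
        return "negative"
    if score in (1, 2):
        return "weak"
    if score in (3, 4, 6):
        return "moderate"
    if score == 9:
        return "strong"
    raise ValueError("not a product of two 0-3 scores: %r" % score)


if __name__ == "__main__":
    # Hypothetical specimen: 3+ intensity in 60% of cells.
    s = composite_score(3, density_score(60))
    print(s, expression_category(s))
```

Because both factors run from 0 to 3, the only possible products are 0, 1, 2, 3, 4, 6 and 9, so the four groups cover every score; a diffuse 3+ specimen reaches 9, whereas a focal 3+ specimen (under 25% of cells) scores only 3.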

Statistical analysis

Intensity and density of PSCA protein and mRNA
expression in BPH, PIN and Pca tissues were compared using the
Chi-square and Student's t-test. Univariate associations
between PSCA expression and Gleason score, clinical
stage and progression to androgen-independence were
calculated using Fisher's Exact Test. For all analyses, p <
0.05 was considered statistically significant.

Results 

PSCA expression in BPH

In general, PSCA protein and mRNA were expressed
weakly in individual samples of BPH. Some areas of
prostate expressed weak levels (composite score 1-2),
whereas other areas were completely negative (composite
score 0). Four cases (20%) of BPH had moderate
expression of PSCA protein and mRNA (composite score 4-6)
by IHC and ISH. In 2/20 (10%) BPH specimens, PSCA
mRNA expression was moderate (composite score 3-6),
but PSCA protein expression was weak (composite score
2) in one and negative (composite score 0) in the other.
PSCA expression was localized to the basal and secretory
epithelial cells, and prostatic stroma stained almost entirely
negative for PSCA protein and mRNA in all cases
examined.

PSCA expression in PIN

In this study, we detected weak or negative expression of
PSCA protein and mRNA (0-2 scores) in 7 of 9 (77.8%)
low grade PIN and in 2 of 11 (18.2%) HGPIN, and
moderate expression (3-6 scores) in the remaining 2 low grade PIN
and 5 of 11 (45.5%) HGPIN. One HGPIN with moderate
PSCA mRNA expression (6 score) was found to stain
weakly for PSCA protein (2 score) by IHC. Strong PSCA
protein and mRNA expression (9 score) was detected in the
remaining 3 of 11 (27.3%) HGPIN. There was a
statistically significant difference in PSCA protein and mRNA
expression levels between HGPIN and BPH (p <
0.05), but no statistically significant difference between low
grade PIN and BPH (p > 0.05).

PSCA expression in Pca

In order to determine if PSCA protein and mRNA can be
detected in prostate cancers and if PSCA expression levels
are increased in malignant compared with benign glands,
forty-eight paraffin-embedded Pca specimens were
analysed by IHC and ISH. It was shown that 19 of 48 (39.6%)
Pca samples stained very strongly for PSCA protein and
mRNA with a score of 9 and another 21 (43.8%)
specimens displayed moderate staining with scores of 4-6
(Figure 1). In addition, 4 specimens with moderate to strong
PSCA mRNA expression (scores of 4-9) had weak protein
staining (a score of 2) by IHC analyses. Overall, Pca
expressed a significantly higher level of PSCA protein and
mRNA than any other specimen category in this study (p
< 0.05, compared with BPH and PIN respectively). The
result demonstrates that PSCA protein and mRNA are
overexpressed by a majority of human Pca.

Correlation of PSCA expression with Gleason score in Pca

Using the semi-quantitative scoring method as described
in Materials and Methods, we compared the expression
level of PSCA protein and mRNA with Gleason grade of
Pca, as shown in Table 1. Prostate adenocarcinomas were
graded by Gleason score as 2-4 scores =
well-differentiation, 5-7 scores = moderate-differentiation and 8-10
scores = poor-differentiation [7]. Seventy-two percent of
Gleason scores 8-10 prostate cancers had very strong
staining of PSCA compared to 21% with Gleason scores
5-7 and 17% with 2-4 respectively, demonstrating that
poorly differentiated Pca had significantly stronger
expression of PSCA protein and mRNA than moderately
and well differentiated tumors (p < 0.05). As depicted in
Figure 1, IHC and ISH analyses showed that PSCA protein
and mRNA expression in several cases of poorly
differentiated Pca were particularly prominent, with more intense
and uniform staining. The results indicate that PSCA
expression increases significantly with higher tumor grade
in human Pca.
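
The percentages quoted for each Gleason group follow directly from the specimen counts. The following is a minimal Python sketch, assuming the Table 1 counts are 5 and 1, 19 and 5, and 5 and 13 specimens (score 0-6 and score 9) for Gleason 2-4, 5-7 and 8-10; the variable and function names are hypothetical.

```python
# Counts per Gleason group: (specimens scoring 0-6, specimens scoring 9).
# Values are assumed from Table 1 of the source.
TABLE_1 = {
    "2-4": (5, 1),
    "5-7": (19, 5),
    "8-10": (5, 13),
}


def row_percentages(counts: tuple) -> tuple:
    """Express each count as a whole-number percentage of its row total."""
    total = sum(counts)
    return tuple(round(100 * c / total) for c in counts)


if __name__ == "__main__":
    for grade, counts in TABLE_1.items():
        print(grade, counts, row_percentages(counts))
    # Total number of Pca specimens across all grades.
    print(sum(sum(c) for c in TABLE_1.values()))
```

Under these assumed counts the strong-staining fractions come out at 17%, 21% and 72% for the three groups, and the rows sum to the 48 Pca specimens analysed.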

Correlation of PSCA expression with clinical stage in Pca
The results for PSCA expression at each stage of Pca are
shown in Table 2. Seventy-five percent of
locally advanced and node positive cancers (i.e. C-D
stages) expressed high levels of PSCA, versus
32.5% of cancers that were organ confined (i.e. A-B stages), a
statistically significant difference (p < 0.05). The data
demonstrate that PSCA expression
increases significantly with advanced tumor stage in
human Pca.

Correlation of PSCA expression with androgen-independent
progression of PCa

All 9 specimens of androgen-independent prostate cancers
stained positive for PSCA protein and mRNA. Eight
specimens were obtained from patients managed prior to
androgen ablation therapy. Seven of eight (87.5%) of
these androgen-independent prostate cancers were in the
strongest staining category (score = 9), compared with
three out of eight (37.5%) of patients with androgen-dependent
cancers (p < 0.05). The results demonstrate
that PSCA expression increases significantly with progression
to androgen-independence of human PCa.

It is evident from the results above that within a majority
of human prostate cancers the level of PSCA protein and
mRNA expression correlates significantly with increasing
grade, worsening stage and progression to androgen-independence.

Correlation of PSCA immunostaining and mRNA in situ
hybridization

In all 88 specimens surveyed herein, we compared the
results of PSCA IHC staining with mRNA ISH analysis.
Positive staining areas and their intensity and density scores
evaluated by IHC were identical to those seen by ISH in 79
of 88 (89.8%) specimens (18/20 BPH, 19/20 PIN and 42/48
PCa respectively). Importantly, 27/27 samples with
PSCA mRNA composite scores of 0-2, 32/36 samples
with scores of 3-6 and 22/24 samples with a score of 9
also had PSCA protein expression scores of 0-2, 3-6 and
9 respectively. However, in 5 samples with PSCA mRNA
overall scores of 3-6 and in 2 with scores of 9 there was
weaker or negative PSCA protein expression (i.e. scores of 0-4),
suggesting that this may reflect posttranscriptional
modification of PSCA or that the epitopes recognized by
PSCA mAb may be obscured in some cancers. The data
demonstrate that the results of PSCA immunostaining
were consistent with those of mRNA ISH analysis, showing
a high degree of correlation between PSCA protein
and mRNA expression.
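
The overall IHC/ISH agreement quoted above follows directly from the per-category counts. The short sketch below reproduces that arithmetic; the dictionary and function names are assumptions chosen for illustration, and only the counts (18/20, 19/20, 42/48) are taken from the text.

```python
# Concordant specimens and total specimens per category, as reported in the text.
# The variable and function names are hypothetical, chosen for this example.
CONCORDANCE_COUNTS = {"BPH": (18, 20), "PIN": (19, 20), "PCa": (42, 48)}


def concordance_rate(counts):
    """Return (concordant, total, percent) pooled over all categories."""
    agree = sum(a for a, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return agree, total, 100.0 * agree / total


if __name__ == "__main__":
    agree, total, pct = concordance_rate(CONCORDANCE_COUNTS)
    print(f"overall: {agree}/{total} = {pct:.1f}%")  # overall: 79/88 = 89.8%
    for name, (a, t) in CONCORDANCE_COUNTS.items():
        print(f"{name}: {a}/{t} = {100.0 * a / t:.1f}%")
```

Pooling the counts before dividing, rather than averaging the three per-category percentages, is what yields the 79 of 88 figure given in the passage.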
Figure 1

Representatives of PSCA IHC and ISH staining in PCa (A: IHC staining; B: ISH staining; x200 magnification). A1, B1: negative
control of IHC and ISH. PBS replacing the primary antibody (A1) and hybridization with a sense PSCA probe (B1) showed no
background staining. A2, B2: a moderately differentiated PCa (Gleason score = 3+3 = 6) with moderate staining (composite score =
6) in all malignant cells; A2: IHC shows not only cell surface but also apparent cytoplasmic staining of PSCA protein. A3, B3: a
poorly differentiated PCa (Gleason score = 4+4 = 8) with very strong staining (composite score = 9) in all malignant cells.
Discussion

PSCA is homologous to a group of cell surface proteins
that mark the earliest phase of hematopoietic development.
PSCA mRNA expression is prostate-specific in normal
male tissues and is highly up-regulated in both
androgen-dependent and -independent PCa xenografts
(LAPC-4 tumors). We hypothesize that PSCA may play a
role in PCa tumorigenesis and progression, and may serve
as a target for PCa diagnosis and treatment. In this study,
IHC and ISH showed that in general there was weak or
absent PSCA protein and mRNA expression in BPH and
low grade PIN tissues. However, PSCA protein and mRNA
are widely expressed in HGPIN, the putative precursor of
invasive PCa, suggesting that up-regulation of PSCA is an
early event in prostate carcinogenesis. Recently, Reiter RE
et al [1], using ISH analysis, reported that 97 of 118 (82%)
HGPIN specimens stained strongly positive for PSCA
mRNA. A very similar finding was seen on mouse PSCA
(mPSCA) expression in mouse HGPIN tissues by Tran CP
et al [8]. These data suggest that PSCA may be a new
marker associated with transformation of prostate cells
and tumorigenesis.

We observed that PSCA protein and mRNA are highly
expressed in a large percentage of human prostate cancers,
including advanced, poorly differentiated, androgen-independent
and metastatic cases. Fluorescence-activated
cell sorting and confocal/immunofluorescent studies
demonstrated cell surface expression of PSCA protein in
PCa cells [9]. Our IHC expression analysis of PSCA shows
not only cell surface but also apparent cytoplasmic staining
of PSCA protein in PCa specimens (Figure 1). One possible
explanation for this is that anti-PSCA antibody can
recognize PSCA peptide precursors that reside in the cytoplasm.
Also, it is possible that the positive staining that
appears in the cytoplasm is actually from the overlying
cell membrane [5]. These data seem to indicate that PSCA
is a novel cell surface marker for human PCa.

Our results show that an elevated level of PSCA expression
correlates with high grade (i.e. poor differentiation),
increased tumor stage and progression to androgen-independence
of PCa. These findings support the original IHC
analyses by Gu Z et al [9], who reported that PSCA protein
was expressed in 94% of primary PCa and that the intensity of
PSCA protein expression increased with tumor grade,
stage and progression to androgen-independence. Our
results also corroborate the recent work of Han KR et al
[10], in which a significant association was found between high
PSCA expression and adverse prognostic features such as
high Gleason score, seminal vesicle invasion and capsular
involvement in PCa. It is suggested that PSCA
overexpression may be an adverse predictor for recurrence,
clinical progression or survival of PCa. Hara H et al
[11] used RT-PCR detection of PSA, PSMA and PSCA in 1
ml of peripheral blood to evaluate PCa patients with poor
prognosis. The results showed that among 58 PCa
patients, each PCR indicated prognostic value in the
hierarchy of PSCA>PSA>PSMA RT-PCR, and extraprostatic
cases with positive PSCA PCR had lower disease-progression-free
survival than those with negative PSCA PCR,
demonstrating that PSCA can be used as a prognostic factor.
Dubey P et al [12] reported that elevated numbers of
PSCA+ cells correlate positively with the onset and development
of prostate carcinoma over a long time span in
the prostates of the TRAMP and PTEN +/- models compared
with normal prostates. Taken together with our
present findings, in which PSCA is overexpressed from
HGPIN to almost frank carcinoma, it is reasonable to use
increased PSCA expression level or
increased numbers of PSCA-positive cells in prostate
samples as a prognostic marker to predict the potential
onset of this cancer. These data raise the possibility that
PSCA may have diagnostic utility or clinical prognostic
value in human PCa.

The cause of PSCA overexpression in PCa is not known.
One possible mechanism is that it may result from PSCA
gene amplification. In humans, PSCA is located on chromosome
8q24.2 [1], which is often amplified in metastatic
and recurrent PCa and considered to indicate a poor
prognosis [13-15]. Interestingly, PSCA is in close proximity
to the c-myc oncogene, which is amplified in >20% of
recurrent and metastatic prostate cancers [16,17]. Reiter
RE et al [18] reported that PSCA and MYC gene copy numbers
were co-amplified in 25% of tumors (five out of
twenty), demonstrating that PSCA overexpression is associated
with PSCA and MYC coamplification in PCa. Gu Z
et al [9] recently reported that in 102 specimens available
to compare the results of PSCA immunostaining with
their previous mRNA ISH analysis, 92 (90.2%) had identically
positive areas of PSCA protein and mRNA expression.
Taken together with our findings, in which we
detected moderate to strong expression of PSCA protein
and mRNA in 34 of 40 (85%) PCa specimens examined
simultaneously by IHC and ISH analyses, these data
demonstrate that PSCA protein and mRNA are overexpressed in
human PCa, and that the increased protein level of PSCA
results from the upregulated transcription of its
mRNA.

At present, the regulation mechanisms of human PSCA
expression and its biological function are yet to be elucidated.
PSCA expression may be regulated by multiple factors
[18]. Watabe T et al [19] reported that transcriptional
control is a major component regulating PSCA expression
levels. In addition, induction of PSCA expression may be
regulated or mediated through cell-cell contact and protein
kinase C (PKC) [20]. Homologues of PSCA have
diverse activities, and have themselves been involved in
carcinogenesis. Signalling through SCA-2 has been demonstrated
to prevent apoptosis in immature thymocytes
[21]. Thy-1 is involved in T cell activation and transduces
signals through src-like tyrosine kinases [22]. Ly-6 genes
have been implicated both in tumorigenesis and in cell-cell
adhesion [23-25]. Cell-cell or cell-matrix interaction is
critical for local tumor growth and spread to distal sites.
From its restricted expression in basal cells of normal
prostate and its homology to SCA-2, PSCA may play a role
in stem/progenitor cell function, such as self-renewal (i.e.
anti-apoptosis) and/or proliferation [1]. Taken together
with the results in the present study, we speculate that
PSCA may play a role in tumorigenesis and clinical progression
of PCa through affecting cell transformation and
proliferation. From our results, it is also suggested that
PSCA as a new cell surface antigen may have a number of
potential uses in the diagnosis, therapy and clinical prognosis
of human PCa. PSCA overexpression in prostate
biopsies could be used to identify patients at high risk to
develop recurrent or metastatic disease, and to discriminate
cancers from normal glands in prostatectomy samples.
Similarly, the detection of PSCA-overexpressing cells
in bone marrow or peripheral blood may identify and predict
metastatic progression better than current assays,
which identify only PSA-positive or PSMA-positive prostate
cells.

In summary, we have shown in this study that PSCA protein
and mRNA are maintained in expression from
HGPIN through all stages of PCa in a majority of cases,
which may be associated with prostate carcinogenesis and
correlate positively with high tumor grade (poor cell
differentiation), advanced stage and androgen-independent
progression. PSCA protein overexpression is due to the
upregulation of its mRNA transcription. The results suggest
that PSCA may be a promising molecular marker for
the clinical prognosis of human PCa and a valuable target
for diagnosis and therapy of this tumor.
The fundamental principle of molecular therapeutics in cancer
is to exploit the differences in gene expression between
cancer cells and normal cells. With the advent of cDNA array
technology, most efforts have concentrated on identifying
differences in gene expression at the level of mRNA, which
can be attributable either to DNA amplification or to differences
in transcription. Gene expression is quite complicated,
however, and is also regulated at the level of mRNA stability,
mRNA translation, and protein stability.

The power of translational regulation has been best recognized
among developmental biologists, because transcription
does not occur in early embryogenesis in eukaryotes. For example,
in Xenopus, the period of transcriptional quiescence
continues until the embryo reaches midblastula transition, the
4000-cell stage. Therefore, all necessary mRNA molecules are
transcribed during oogenesis and stockpiled in a translationally
inactive, masked form. The mRNA are translationally activated
at appropriate times during oocyte maturation, fertilization, and
early embryogenesis and thus, are under strict translational
control.

Translation has an established role in cell growth. Basically,
an increase in protein synthesis occurs as a consequence
of mitogenesis. Until recently, however, little was
known about the alterations in mRNA translation in cancer,
and much is yet to be discovered about their role in the
development and progression of cancer. Here we review the
basic principles of translational control, the alterations
encountered in cancer, and selected therapies targeting translation
initiation to elucidate potential new therapeutic avenues.




IV^islaSon Niialton is ttie ^ 

Translation initiation is a complex process in which the initiator
tRNA and the 40S and 60S ribosomal subunits are recruited to
the 5' end of a mRNA molecule and assembled by eukaryotic
translation initiation factors [...] codon of the mRNA (Fig. 1).
The 5' end of eukaryotic mRNA is capped, i.e., contains the cap
structure m7GpppN (7-methylguanosine-triphospho-5'-ribonucleoside).
Most translation in eukaryotes occurs in a cap-dependent fashion,
i.e., the cap is specifically recognized by eIF4E, which binds the 5' cap.
The eIF4F translation initiation complex is then formed by the
assembly of eIF4E, the RNA helicase eIF4A, and eIF4G, a
scaffolding protein that mediates the binding of the 40S ribosomal
subunit to the mRNA molecule through interaction with
the eIF3 protein present on the 40S ribosome. eIF4A and eIF4B
participate in melting the secondary structure of the 5' UTR of
the mRNA. The 43S initiation complex (40S/eIF2/Met-tRNA/GTP
complex) scans the mRNA in a 5'→3' direction until it
encounters an AUG start codon. This start codon is then base-paired
to the anticodon of initiator tRNA, forming the 48S initiation
complex. The initiation factors are then displaced from the
48S complex, and the 60S ribosome joins to form the 80S
ribosome.

Unlike most eukaryotic translation, translation initiation of
certain mRNAs, such as the picornavirus RNA, is cap independent
and occurs by internal ribosome entry. This mechanism
does not require eIF4E. Either the 43S complex can bind
the initiation codon directly through interaction with the IRES in
the 5' UTR such as in the encephalomyocarditis virus, or it can
The abbreviations used are: eIF4E, eukaryotic initiation factor 4E; UTR,
untranslated region; IRES, internal ribosome entry site; 4E-BP1, eukaryotic
initiation factor 4E-binding protein 1; S6K, ribosomal p70 S6 kinase; mTOR,
[...] phosphatidylinositol 3-kinase; PTEN, phosphatase and tensin homolog
deleted from chromosome 10; PP2A, protein phosphatase 2A.



Translation Initiation in Cancer

[Fig. 1. Translation initiation in eukaryotes. Panel labels: release of eIF4E; binding of eIF4F complex to
mRNA 5' cap; unwinding of secondary structure and formation of 48S complex; release of initiation
factors and formation of 80S ribosome. Legend: The 4E-BPs are hyperphosphorylated to release eIF4E
so that [...]. The 40S ribosomal subunit is bound to eIF3, and the ternary complex [...]
released, and the large ribosomal subunit is recruited.]

scanning or transfer, as is the case with the poliovirus (1).

Stimulation of Translation Initiation
Translation initiation can be regulated by alterations in the
expression or phosphorylation status of the various factors
involved. Key components in translational regulation that
may provide potential therapeutic targets follow.

eIF4E. eIF4E plays a central role in translation regulation.
It is the least abundant of the initiation factors and is considered
the rate-limiting component for initiation of cap-dependent
translation. eIF4E may also be involved in mRNA
splicing, mRNA 3' processing, and mRNA nucleocytoplasmic
transport (2). eIF4E expression can be increased at the
transcriptional level in response to serum or growth factors
(3). eIF4E overexpression may cause preferential translation
of mRNAs containing excessive secondary structure in their
5' UTR that are normally discriminated against by the translational
machinery and thus are inefficiently translated [...].
As examples of this, overexpression of eIF4E promotes increased
translation of vascular endothelial growth factor,
fibroblast growth factor [...]

Another mechanism of control is the regulation of eIF4E
phosphorylation. eIF4E phosphorylation is mediated by the
mitogen-activated protein kinase-interacting kinase 1, which
is activated by the mitogen-activated pathway activating
extracellular signal-related kinases and the stress-activated
pathway acting through p38 mitogen-activated protein kinase
(10-13). Several mitogens, such as serum, platelet-derived
growth factor, epidermal growth factor, insulin,
angiotensin II, src kinase overexpression, and ras overexpression,
lead to eIF4E phosphorylation (14). The phosphorylation
status of eIF4E is usually correlated with the
translational rate and growth status of the cell; however,
eIF4E phosphorylation has also been observed in response
to some cellular stresses when translational rates actually
decrease (15). Thus, further study is needed to understand
the effects of eIF4E phosphorylation on eIF4E activity.

Another mechanism of regulation is the alteration of eIF4E
availability by the binding of eIF4E to the eIF4E-binding proteins
(4E-BP, also known as PHAS-I). 4E-BPs compete with
eIF4G for a binding site in eIF4E. The binding of eIF4E to the
best characterized eIF4E-binding protein, 4E-BP1, is regulated
by 4E-BP1 phosphorylation. Hypophosphorylated 4E-BP1
binds to eIF4E, whereas 4E-BP1 hyperphosphorylation
decreases this binding. Insulin, angiotensin, epidermal
growth factor, platelet-derived growth factor, hepatocyte
growth factor, nerve growth factor, insulin-like growth factors
I and II, interleukin 3, granulocyte-macrophage colony-stimulating
factor + steel factor, gastrin, and the adenovirus have
all been reported to induce phosphorylation of 4E-BP1 and
to decrease the ability of 4E-BP1 to bind eIF4E (15, 16).
Conversely, deprivation of nutrients or growth factors results
in 4E-BP1 dephosphorylation, an increase in eIF4E binding,
and a decrease in cap-dependent translation.

p70 S6 Kinase. Phosphorylation of ribosomal 40S protein
S6 by S6K is thought to play an important role in translational
regulation. S6K-/- mouse embryonic cells proliferate more
slowly than do parental cells, demonstrating that S6K has a
positive influence on cell proliferation (17). S6K regulates the
translation of a group of mRNAs possessing a 5' terminal
oligopyrimidine tract (5' TOP) found at the 5' UTR of ribosomal
protein mRNAs and other mRNAs coding for components of
the translational machinery. Phosphorylation of S6K is regulated
in part based on the availability of nutrients [...] and is
stimulated by several growth factors, such as platelet-derived
growth factor and insulin-like growth factor I (20).

eIF2α Phosphorylation. The binding of the initiator tRNA
to the small ribosomal unit is mediated by translation initiation
factor eIF2. Phosphorylation of the α-subunit of eIF2
prevents formation of the eIF2/GTP/Met-tRNA complex and
inhibits global protein synthesis (21, 22). eIF2α is phosphorylated
under a variety of conditions, such as viral infection,
nutrient deprivation, heme deprivation, and apoptosis (2[...]).
eIF2α is phosphorylated by heme-regulated inhibitor, nutrient-regulated
protein kinase, and the IFN-induced, double-stranded
RNA-activated protein kinase (PKR; Ref. 2[...]).



The mTOR Signaling Pathway. The macrolide antibiotic
rapamycin (Sirolimus; Wyeth-Ayerst Research, Collegeville,
PA) has been the subject of intense study because it inhibits
signal transduction pathways involved in T-cell activation.
The rapamycin-sensitive component of these pathways
is mTOR (also called FRAP or RAFT1). mTOR is the mammalian
homologue of the yeast TOR proteins that regulate G1
progression and translation in response to nutrient availability
(2[...]). mTOR is a serine-threonine kinase that modulates
translation initiation by altering the phosphorylation status of
4E-BP1 and S6K (Fig. 2; Ref. 2[...]).

4E-BP1 is phosphorylated on multiple residues; mTOR
phosphorylates the Thr-37 and Thr-46 residues of 4E-BP1 [...];
however, phosphorylation at these sites is not associated
with a loss of eIF4E binding. Phosphorylation of Thr-37 and
Thr-46 is required for subsequent phosphorylation at several
COOH-terminal, serum-sensitive sites; a combination of these
phosphorylation events appears to be needed to inhibit the
binding of 4E-BP1 to eIF4E [...]. The product of the ATM gene,
[...] pathway, and [...] phosphorylation [...].

S6K and 4E-BP1 are also regulated, in part, by PI3K and its
downstream protein kinase Akt. PTEN is a phosphatase that
negatively regulates PI3K signaling. PTEN null cells have
constitutive activation of Akt, with increased S6K activity and
S6 phosphorylation (30). S6K activity is inhibited both by
PI3K inhibitors wortmannin and LY294002 and mTOR
inhibitor rapamycin [...]. Akt phosphorylates Ser-2448 in
mTOR in vitro, and this site is phosphorylated upon Akt
activation in vivo [...]. Thus, mTOR is regulated by the
PI3K/Akt pathway; however, this does not appear to be the
only mode of regulation of mTOR activity. Whether the PI3K
pathway also regulates S6K and 4E-BP1 phosphorylation
independent of mTOR is controversial.

Interestingly, mTOR autophosphorylation is blocked by wortmannin
but not by rapamycin (3[...]). This seeming inconsistency
suggests that mTOR-responsive regulation of 4E-BP1 and S6K
activity occurs through a mechanism other than intrinsic mTOR
kinase activity. An alternate pathway for 4E-BP1 and S6K
phosphorylation by mTOR activity is by the inhibition of a phosphatase.
Treatment with calyculin A, an inhibitor of phosphatases 1
and 2A, reduces rapamycin-induced dephosphorylation of 4E-BP1
and S6K (3[...]). PP2A interacts with full-length
S6K but not with a S6K mutant that is resistant to dephosphorylation
resulting from rapamycin. mTOR phosphorylates PP2A
in vitro; however, how this process alters PP2A activity is not
known. These results are consistent with the model that
phosphorylation of a phosphatase by mTOR prevents dephosphorylation
of 4E-BP1 and S6K, and conversely, that nutrient deprivation
and rapamycin block inhibition of the phosphatase by
mTOR.

Polyadenylation. The poly(A) tail in eukaryotic mRNA is
important in enhancing translation initiation and mRNA stability.
Polyadenylation plays a key role in regulating gene
expression during oogenesis and early embryogenesis.
Some mRNA that are translationally inactive in the oocyte are
polyadenylated concomitantly with translational activation in
oocyte maturation, whereas other mRNAs that are translationally
active during oogenesis are deadenylated and translationally
silenced (36-38). Thus, control of poly(A) tail synthesis
is an important regulatory step in gene expression.
The 5' cap and poly(A) tail are thought to function synergistically
to regulate mRNA translational efficiency [...].

[Fig. 2. Regulation of translation initiation by signal transduction pathways.
Signaling via p38, extracellular signal-related kinase, PI3K, and
mTOR can all activate translation initiation.]

RNA Packaging. Most RNA-binding proteins are assembled
on a transcript at the time of transcription, thus determining
the translational fate of the transcript (41). A highly
conserved family of Y-box proteins is found in cytoplasmic
messenger ribonucleoprotein particles, where the proteins
are thought to play a role in restricting the recruitment of
mRNA to the translational machinery (41-43). The major
mRNA-associated protein, YB-1, destabilizes the interaction
of eIF4E and the 5' mRNA cap in vitro, and overexpression of
YB-1 results in translational repression in vivo (4[...]). Thus,
alterations in RNA packaging can also play an important role
in translational regulation.

Translation Alterations Encountered in Cancer
Three main alterations at the translational level occur in cancer:
variations in mRNA sequences that increase or decrease translational
efficiency, changes in the expression or availability of
components of the translational machinery, and activation of
translation through aberrantly activated signal transduction
pathways. The first alteration affects the translation of an individual
mRNA that may play a role in carcinogenesis. The second
and third alterations can lead to more global changes, such
as an increase in the overall rate of protein synthesis and the
translational activation of several mRNA species.

Variations in mRNA sequence affect the translational efficiency
[...] and examples of each mechanism follow.

Mutations. Mutations in the mRNA sequence, especially
in the 5' UTR, can alter its translational efficiency, as seen in
the following examples.
c-myc. Saito et al. proposed that translation of [...]
c-myc is repressed, whereas in several Burkitt lymphomas
that have deletions of the mRNA 5' UTR, translation [...]
is more efficient (4[...]). More recently, it [...]
5' UTR of c-myc contains an IRES, and thus translation
can be initiated by a cap-independent as well as a
cap-dependent mechanism (46, 47). In patients with multiple
myeloma, a C→T mutation in the c-myc IRES was identified
[...] and found to cause an enhanced initiation of translation
via internal ribosomal entry (49).

BRCA1. A somatic point mutation (117 G→C) in position
-3 with respect to the start codon of the BRCA1 gene was
identified in a highly aggressive sporadic breast cancer [...].
Chimeric constructs consisting of the wild-type or mutated
BRCA1 5' UTR and a downstream luciferase reporter demonstrated
a decrease in the translational efficiency [...]
UTR mutation.

Cyclin-dependent Kinase Inhibitor 2A. Some inherited
melanoma kindreds have a G→T transversion at base -34
of cyclin-dependent kinase inhibitor-2A, which encodes a
cyclin-dependent kinase 4/cyclin-dependent kinase 6 kinase
inhibitor important in checkpoint regulation (51). This
mutation gives rise to a novel AUG translation initiation
codon, creating an upstream open reading frame that competes
for scanning ribosomes and decreases translation
from the wild-type AUG.

Alternate Splicing and Alternate Transcription Start
Sites. Alterations in splicing and alternate transcription sites
can lead to variations in 5' UTR sequence, length, and secondary
structure, ultimately impacting translational efficiency.

ATM. The ATM gene has four noncoding exons in its 5'
UTR that undergo extensive alternative splicing (52). The
contents of 12 different 5' UTRs that show considerable
diversity in length and sequence have been identified. These
divergent 5' leader sequences play an important role in the
translational regulation of the ATM gene.

mdm2. In a subset of tumors, overexpression of the oncoprotein
mdm2 results in enhanced translation of the mdm2
mRNA. Use of different promoters leads to two mdm2 transcripts
that differ only in their 5' leaders (53). The longer 5'
UTR contains two upstream open reading frames, and this
mRNA is loaded with ribosomes inefficiently compared with
the short 5' UTR.

BRCA1. In a normal mammary gland, BRCA1 mRNA is
expressed with a shorter leader sequence (5' UTRa), whereas
in sporadic breast cancer tissue, BRCA1 mRNA is expressed
with a longer leader sequence (5' UTRb); the translational
efficiency of transcripts containing 5' UTRb is 10 times lower
than that of transcripts containing 5' UTRa (54).

TGF-β3. TGF-β3 mRNA includes a 1.1-kb 5' UTR, which
exerts an inhibitory effect on translation. Many human breast
cancer cell lines contain a novel TGF-β3 transcript with a 5'
UTR that is 870 nucleotides shorter and has a 7-fold greater
translational efficiency than the normal TGF-β3 mRNA.

Alternate Polyadenylation Sites. Multiple polyadenylation
signals leading to the generation of several transcripts
with differing 3' UTR have been described for several mRNA
species, such as the RET proto-oncogene (5[...]), ATM gene
[...], tissue inhibitor of metalloproteinases-3 (57), RHOA
proto-oncogene (5[...]), and calmodulin-I (59). Although the
effect of these alternate 3' UTRs on translation is not yet
known, they might be important in RNA-protein interactions
that affect translational recruitment. The role of these
alterations in cancer development and progression is unknown.

Translation Machinery

Alterations in the components of translation machinery can
take many forms.

Overexpression of eIF4E. Overexpression of eIF4E
causes malignant transformation in rodent cells (60) and
deregulation of HeLa cell growth (61). Polunovsky et al. (62)
found that eIF4E overexpression substitutes for serum and
individual growth factors in preserving viability of fibroblasts,
which suggests that eIF4E can mediate both proliferative and
survival signaling.

Elevated levels of eIF4E mRNA have been found in a broad
spectrum of transformed cell lines (63). eIF4E levels are
elevated in all ductal carcinoma in situ specimens and invasive
ductal carcinomas, compared with benign breast specimens
evaluated with Western blot analysis (64, 65). Preliminary
studies suggest that this overexpression is attributable
to gene amplification (66).

There are accumulating data suggesting that eIF4E overexpression
can be valuable as a prognostic marker. eIF4E
overexpression was found in a retrospective study to be a marker of
poor prognosis in stages I to III breast carcinoma (67). Verification
of the prognostic value of eIF4E in breast cancer is now
underway in a prospective trial (67). However, in a different
study, eIF4E expression was correlated with the aggressive
behavior of non-Hodgkin's lymphomas [...]. In a prospective
analysis of patients with head and neck cancer, elevated levels
of eIF4E in histologically tumor-free surgical margins predicted
a significantly increased risk of local-regional recurrence [...].
These results all suggest that eIF4E overexpression can be
used to select patients who may benefit from more aggressive
systemic therapy. Furthermore, the head and neck cancer data
suggest that eIF4E overexpression is a field defect and can be
used to guide local therapy.

Alterations in Other Initiation Factors. Alterations in a
number of other initiation factors have been associated with
cancer. Overproduction of eIF4G, similar to eIF4E, leads to
malignant transformation in vitro (69). eIF-2α is found in
increased levels in bronchioloalveolar carcinomas of the lung
(3). Initiation factor eIF-4A1 is overexpressed in melanoma
(70) and hepatocellular carcinoma (71). The p40 subunit of
translation initiation factor 3 is amplified and overexpressed
in breast and prostate cancer (72), and the eIF3-p110 subunit
is overexpressed in testicular seminoma (73). The role that
overexpression of these initiation factors plays on the development
and progression of cancer, if any, is not known.

Overexpression of S6K. S6K is amplified and highly
overexpressed in the MCF7 breast cancer cell line, compared
with normal mammary epithelium (74). In a study by
Barlund et al. (74), S6K was amplified in 59 of 668 primary
breast tumors, and a statistically significant association was
observed between amplification and poor prognosis.



[...] PAP is overexpressed in human cancer cells compared
with normal and virally transformed cells (7[...]). PAP
enzymatic activity in breast tumors [...]
PAP protein levels [...] in mammary tumor [...]
found to be an independent factor for predicting survival (76).
Little is known, however, about [...]
affects the translational profile.

Alterations in RNA-binding Proteins. Even less is known
about alterations in RNA packaging in cancer. Increased expression
and nuclear localization of RNA-binding protein
YB-1 are indicators of a poor prognosis for breast cancer [...],
non-small cell lung cancer (78), and ovarian cancer [...]. However,
this effect may be mediated at least in part [...]
translation, because YB-1 increases chemoresistance by enhancing
the transcription of a multidrug resistance [...].



Activation of signal transduction pathways by loss of tumor
suppressor [...]
can contribute to the growth and aggressiveness of tumors. An
important mutant in human cancers is the tumor suppressor
gene PTEN, which leads to the activation of [...] pathway.
Activation of PI3K and Akt induces [...]
transformation of chicken embryo fibroblasts [...]
show constitutive phosphorylation [...].
A mutant Akt that retains kinase activity but does not
phosphorylate S6K or 4E-BP1 does not transform [...]
suggests a correlation between the oncogenicity of [...]
Akt and the phosphorylation of S6K and 4E-BP1 [...].

Several tyrosine kinases such as platelet-derived growth
factor, insulin-like growth factor, HER2/neu, and epidermal
growth factor receptor are overexpressed in cancer. Because
these kinases activate downstream signal transduction
pathways known to alter translation initiation, activation
of translation is likely to contribute to the growth and
aggressiveness of these tumors. Furthermore, the mRNAs for many
of these kinases themselves are under translational control.
For example, HER2/neu mRNA is translationally controlled
both by a short upstream open reading frame that represses
HER2/neu translation in a cell type-independent manner and
by a distinct cell type-dependent mechanism that increases
translational efficiency (82). HER2/neu translation is different
in transformed and normal cells. Thus, it is possible that
alterations at the translational level can in part account for
the discrepancy between HER2/neu gene amplification
detected by fluorescence in situ hybridization and protein levels
detected by immunohistochemical assays.



Translational Targets of Selected Cancer Therapy
Components of the translation machinery and signal pathways
involved in the activation of translation initiation
represent good targets for cancer therapy.

Targeting the mTOR Signaling Pathway: Rapamycin

Rapamycin inhibits the proliferation of lymphocytes. It was
initially developed as an immunosuppressive drug for organ
transplantation. Rapamycin with FKBP 12 (FK506-binding
protein, Mr 12,000) binds to mTOR to inhibit its function.

Ra?«uTiy<^Cfflisesa8nttdlbuleij^fffcamr^ IntDu® 
Initlatkm rate of protein ^mthesls (8^. It blocks cell gjrowth Q otj 
part by bloddng 88 phoophmytaHon and e^ectivety mp>^ 
pres^gthatean^atkmof 5'TOPmR^aAs, suchasribosom 
prot^nSp and elongation fiact<»B Rapamycin also 

blodcs 4E-BP1 pho^orylallcMi and inhibits caflxlependeirtt 
but not C8q>lndependent translation (17, 8^ 

The r^)anfiydn-G87£iUve s^jr^ tmnsducflon paHhwa^i adSk 
vated during maBgnanitransjicOTa^ 
Is now bi*ig studied as a tag^ fbr cancer fhetne^ 
tat^ bf^^ c^ hmg, gSdt^aston^ meteortomat ar^ 
leykevTila are amof^ the cancer Ones most sensitive to Hh© 
rapamycin srah^ CC8-779 pWi^^aJh^^ess^'R^e®^ Rsff. 
67>. In rhabdomyooCTccjia rapamycin b elt^ 

statte or c^todcM d^^^riding on tha p63 stedus (tf th^ 
vidld-^O8llstri8afi6dwi0irapairqff:barv^In pteae 
and maintain th^.vlal^, wheieas p^ iinutant oells aocurm^ 
tedslrtQf ajidundeiBOfi^j34a^(B8. 8S^. asisc©!!^ 
study leti^ ^mm pjWHvs neuroeotodsnml tunncr and 
mec^(ot)fastoma mo(teSs, rapamycin eodiibited mora cytotoK* 
Idly in conibfT^itbn dsf^atin and 

agent. &j OCa-77S defejTBd grcnw^ 
l^afterl of therapy and 240% aftBr2w68S9LAair^1!i8^ 
h^Ktoea adndnfeiralkm cai^ a 8796 dacreasa In tisnmr 
vduffca Growth br^fltdtlon h wto was i^ tfenae greater, ^fiMi) 
cisp!alinbicmbirelkmwlthOCI-77gth^ abn^ 
{SG). Thus, pTBcQnteai sSudi^ subtest tfiat rapamycin anah- 
togues m useftd as ^r^ stents and In combJnatoi wlQlh 
chemolhenapy. 

Rapamycin analogues CCI-779 and RAD001 (Novartis,
Basel, Switzerland) are now in clinical trials. Because of the
known effect of rapamycin on lymphocyte proliferation, a
potential problem with rapamycin analogues is
immunosuppression. However, although prolonged immunosuppression
can result from rapamycin and CCI-779 administered on
continuous-dose schedules, the immunosuppressive effects
of rapamycin analogues resolve in ~24 h after therapy
(91). The principal toxicities of CCI-779 have included
dermatological toxicity, myelosuppression, infection, mucositis,
diarrhea, reversible elevations in liver function tests,
hyperglycemia, hypokalemia, hypocalcemia, and depression (87,
92-94). Phase II trials of CCI-779 have been conducted in
advanced renal cell carcinoma and in stage III/IV breast
carcinoma patients who failed prior chemotherapy. In
the results reported in abstract form, although there were no
complete responses, partial responses were documented in
both renal cell carcinoma and in breast carcinoma (94, 95).
Thus, CCI-779 has documented preliminary clinical activity in
a previously treated, unselected patient population.

Active investigation is underway into patient selection for
mTOR inhibitors. Several studies have found an enhanced
efficacy of CCI-779 in PTEN-null tumors (90, 96). Another
study found that six of eight breast cancer cell lines were
responsive to CCI-779, although only two of these lines
lacked PTEN (97). There was, however, a positive correlation
between Akt activation and CCI-779 sensitivity. This
correlation suggests that activation of the PI3K-Akt pathway,






regardless of whether it is attributable to a PTEN mutation or
to overexpression of receptor tyrosine kinases, makes cancer
cells amenable to mTOR-directed therapy. In contrast,
lower levels of the target of mTOR, 4E-BP1, are associated
with rapamycin resistance; thus, a lower 4E-BP1/eIF4E ratio
may predict rapamycin resistance (98).

Another mode of activity for rapamycin
appears to be through inhibition of angiogenesis. This activity
may be both through direct inhibition of endothelial cell
proliferation as a result of mTOR inhibition in these cells or by
inhibition of translation of such proangiogenic factors as
vascular endothelial growth factor in tumor cells.

The angiogenesis inhibitor tumstatin, another anticancer
drug currently under study, was also found recently to inhibit
translation in endothelial cells (101). Through a requisite
interaction with integrin, tumstatin inhibits activation of the
PI3K-Akt pathway and mTOR in endothelial cells and
prevents dissociation of eIF4E from 4E-BP1, thereby inhibiting
cap-dependent translation. These findings suggest that
endothelial cells are especially sensitive to therapies targeting
the mTOR-signaling pathway.



rnesrit ateo reduce ilhe ej^jTi^^ of ar^kq^ 

amJ h^ been pjqK>sed as a potential acfluvant tlhs(^ 

amS necJc cem»3S, p®tf catoty when dea^ 

aiaiBM inaigjns. anaa molecate 1^^ 

4&BP1-blndli^ domain of ^F4E are proapoptotte (116) ancfl 

are also beb^ acQveV purausd. 



Mxpimitng SeSscmiS^ intmskimffB to- @e?D© Yhismp^ 

A different thOTpautte appmach that talces advan^ 

erihanced cap-dependent tran^tlon In carwerc^ is the us^ 

Of gene th^apy vectnjs snoodlng siMte genes vwRh WghO^jf 

s&uct uredS'U TRlhesemWjAwMA^ 

disadvantage fri nomial oefis and noi trensSate wen, whs^ 

cancer th^ trarffiJato more efl^^ 

thaErtycducllonof thsS' UTRoffibTt&feBtgj^^ 

the ccdb^ seqiwnce otf /i»pas ^hptejf 

ftir^ g^ aBoift^ for ^leotive tren^atfon of /t^^ 

V!&us(ypa-Y tfi^Qn&Aiefttesegei^ In breast cancer ceQ Ones 

compared wMt no?maI R^mims^ ceS Cnss and refill In 

fec«tf® 88n^tly%to g8Rc6cIovLr(ll7)- 



EPA is an n-3 polyunsaturated fatty acid found in the
fish-based diets of populations having a low incidence of cancer
(102). EPA inhibits the proliferation of cancer cells (103), as
well as in animal models (104, 105). It blocks cell division by
inhibiting translation initiation (12). EPA releases Ca2+ from
intracellular stores while inhibiting their refilling, thereby
activating PKR. PKR, in turn, phosphorylates and inhibits eIF2α,
resulting in the inhibition of protein synthesis at the level of
translation initiation. Similarly, clotrimazole, a potent
antiproliferative agent in vitro and in vivo, inhibits cell growth through
depletion of Ca2+ stores, activation of PKR, and
phosphorylation of eIF2α (106). Consequently, clotrimazole
preferentially decreases the expression of cyclins A, E, and D1,
resulting in blockage of the cell cycle in G1.

mda-7 is a novel tumor suppressor gene being developed
as a gene therapy agent. Adenoviral transfer of mda-7
(Ad-mda7) induces apoptosis in many cancer cells including
breast, colorectal, and lung cancer (107-109). Ad-mda7 also
induces and activates PKR, which leads to phosphorylation
of eIF2α and induction of apoptosis (110).

Flavonoids such as genistein and quercetin suppress
tumor cell growth. All three mammalian eIF2α kinases, PKR,
heme-regulated inhibitor, and PERK/PEK, are activated by
flavonoids, with phosphorylation of eIF2α and inhibition of
protein synthesis (111).



asft<3 PapllSd!&3 ■ 

Antisense expression of eIF4A decreases the proliferation rate
of melanoma cells (112). Sequestration of eIF4E by overexpression
of 4E-BP1 is proapoptotic and decreases tumorigenicity
(113, 114). Reduction of eIF4E with antisense RNA decreases
soft agar growth, increases tumor latency, and increases the
rates of turner doubling times (7). Anllsense elF4E RMA treat- 



Translation is a crucial process in every cell. However, several
alterations in translational control occur in cancer. Cancer cells
appear to need an aberrantly activated translational state for
survival, thus allowing the targeting of translation initiation with
surprisingly low toxicity. Components of the translational
machinery, such as eIF4E, and signal transduction pathways
involved in translation initiation, such as mTOR, represent promising
targets for cancer therapy. Inhibitors of the mTOR have already
shown some preliminary activity in clinical trials. It is possible
that with the development of better predictive markers and
better patient selection, response rates to single-agent therapy
can be improved. Similar to other cytostatic agents, however,
mTOR inhibitors are most likely to achieve clinical utility in
combination therapy. In the interim, our increasing understanding
of translation initiation and signal transduction pathways
promise to lead to the identification of new therapeutic targets
in the near future.
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I, Avi Ashkenazi, Ph.D. declare and say as follows: - 

1. I am Director and Staff Scientist at the Molecular Oncology Department of
Genentech, Inc., South San Francisco, CA 94080.
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secreted proteins over-expressed in tumors, with the aim to identify useful targets for the
development of therapeutic antibodies for cancer treatment.
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4. Gene amplification is a process in which chromosomes undergo changes to 
contain multiple copies of certain genes that normally exist as a single copy, and is an important
factor in the pathophysiology of cancer. Amplification of certain genes (e.g., Myc or Her2/Neu)
gives cancer cells a growth or survival advantage relative to normal cells, and might also provide
a mechanism of tumor cell resistance to chemotherapy
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Editorial; 

Editorial Board Member
Associate Editor, Clinical Cancer Research.
Associate Editor, Cancer Biology and Therapy.

Refereed papers: 

1. Gertler, A., Ashkenazi, A., and Madar, Z. Binding sites for human growth
hormone and ovine and bovine prolactins in the mammary gland and liver of the
lactating cow. Mol. Cell. Endocrinol. 34, 51-57 (1984).

2. Gertler, A., Shamay, A., Cohen, N., Ashkenazi, A., Friesen, H., Levanon, A.,
Gorecki, M., Aviv, R., Hadari, D., and Vogel, T. Inhibition of lactogenic
activities of ovine prolactin and human growth hormone (hGH) by a novel form of
a modified recombinant hGH. Endocrinology 118, 720-726 (1986).

3. Ashkenazi, A., Madar, Z., and Gertler, A. Partial purification and characterization
of bovine mammary gland prolactin receptor. Mol. Cell. Endocrinol. 50, 79-87
(1987).

4. Ashkenazi, A., Pines, M., and Gertler, A. Down-regulation of lactogenic
hormone receptors in Nb2 lymphoma cells by cholera toxin. Biochem.
Internat. 14, 1065-1072 (1987).

5. Ashkenazi, A., Cohen, R., and Gertler, A. Characterization of lactogen receptors
in lactogenic hormone-dependent and independent Nb2 lymphoma cell lines.
FEBS Lett. 210, 51-55 (1987).

6. Ashkenazi, A., Vogel, T., Barash, I., Hadari, D., Levanon, A., Gorecki, M., and
Gertler, A. Comparative study on in vitro and in vivo modulation of lactogenic
and somatotropic receptors by native human growth hormone and its modified
recombinant analog. Endocrinology 121, 414-419 (1987).

7. Peralta, E., Winslow, J., Peterson, G., Smith, D., Ashkenazi, A., Ramachandran,
J., Schimerlik, M., and Capon, D. Primary structure and biochemical properties
of an M2 muscarinic receptor. Science 236, 600-605 (1987).

8. Peralta, E., Ashkenazi, A., Winslow, J., Smith, D., Ramachandran, J., and Capon,
D. J. Distinct primary structures, ligand-binding properties and tissue-specific
expression of four human muscarinic acetylcholine receptors. EMBO J. 6,
3923-3929 (1987).

9. Ashkenazi, A., Winslow, J., Peralta, E., Peterson, G., Schimerlik, M., Capon, D.,
and Ramachandran, J. An M2 muscarinic receptor subtype coupled to both
adenylyl cyclase and phosphoinositide turnover. Science 238, 672-675 (1987).
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10. Pines, M., Ashkenazi, A., Cohen-Chapnik, N., Binder, L., and Gertler, A.
Inhibition of the proliferation of

concentrations of cholera toxin and partial reversal of the effect by 12-O-
tetradecanoyl-phorbol-13-acetate. J. Cell. Biochem. 37, 119-129 (1988).

11. Peralta, E., Ashkenazi, A., Winslow, J., Ramachandran, J., and Capon, D.
Differential regulation of PI hydrolysis and adenylyl cyclase by muscarinic
receptor subtypes. Nature 334, 434-437 (1988).

12. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functionally distinct G proteins couple different receptors to PI hydrolysis in the
same cell. Cell 56, 487-493 (1989).

13. Ashkenazi, A., Ramachandran, J., and Capon, D. Acetylcholine analogue
stimulates DNA synthesis in brain-derived cells via specific muscarinic
acetylcholine receptor subtypes. Nature 340, 146-150 (1989).

14. Lamarre, D., Ashkenazi, A., Fleury, S., Smith, D., Sekaly, R., and Capon, D.
The MHC-binding and gp120-binding domains of CD4 are distinct and separable.
Science 245, 743-745 (1989).

15. Ashkenazi, A., Presta, L., Marsters, S., Camerato, T., Rosenthal, K., Fendly, B.,
and Capon, D. Mapping the CD4 binding site for human immunodeficiency
virus type 1 by alanine-scanning mutagenesis. Proc. Natl. Acad. Sci. USA. 87,
7150-7154 (1990).

16. Chamow, S., Peers, D., Byrn, R., Mulkerrin, M., Harris, R., Wang, W., Bjorkman,
P., Capon, D., and Ashkenazi, A. Enzymatic cleavage of a CD4 immunoadhesin
generates crystallizable, biologically active Fd-like fragments

9885-9891 (1990).

17. Ashkenazi, A., Smith, D., Marsters, S., Riddle, L., Gregory, T., Ho, D., and
Capon, D. Resistance of primary isolates of human immunodeficiency virus type
1 to soluble CD4 is independent of CD4-rgp120 binding affinity. Proc. Natl.
Acad. Sci. USA. 88, 7056-7060 (1991).

18. Ashkenazi, A., Marsters, S., Capon, D., Chamow, S., Figari, I., Pennica, D.,
Goeddel, D., Palladino, M., and Smith, D. Protection against endotoxic shock by
a tumor necrosis factor receptor immunoadhesin. Proc. Natl. Acad. Sci. USA. 88,
10535-10539 (1991).

19. Moore, J., McKeating, J., Huang, Y., Ashkenazi, A., and Ho, D. Virions of
primary HIV-1 isolates resistant to sCD4 neutralization differ in sCD4 affinity and
glycoprotein gp120 retention from sCD4-sensitive isolates. J. Virol. 66, 235-243
(1992).
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20. Jin, H., Oksenberg, D., Ashkenazi, A., Peroutka, S., Duncan, A., Rozmahel, R.,
Yang, Y., Mengod, G., Palacios, J., and O'Dowd, B. Characterization of the
human 5-hydroxytryptamine1B receptor. J. Biol. Chem. 267, 5735-5738 (1992).

21. Marsters, A., Frutkin, A., Simpson, N., Fendly, B., and Ashkenazi, A.
Identification of cysteine-rich domains of the type 1 tumor necrosis receptor
involved in ligand binding. J. Biol. Chem. 267, 5747-5750 (1992).

22. Chamow, S., Kogan, T., Peers, D., Hastings, R., Byrn, R., and Ashkenazi, A.
Conjugation of sCD4 without loss of biological activity via a novel carbohydrate-
directed cross-linking reagent. J. Biol. Chem. 267, 15916-15922 (1992).

23. Oksenberg, D., Marsters, A., O'Dowd, B., Jin, H., Havlik, S., Peroutka, S., and
Ashkenazi, A. A single amino-acid difference confers major pharmacologic
variation between human and rodent 5-HT1B receptors. Nature 360, 161-163
(1992).

24. Haak-Frendscho, M., Marsters, S., Chamow, S., Peers, D., Simpson, N., and
Ashkenazi, A. Inhibition of interferon γ by an interferon γ receptor
immunoadhesin. Immunology 79, 594-599 (1993).

25. Penica, D., Lam, V., Weber, R., Kohr, W., Basa, L., Spellman, M., Ashkenazi, A.,
Shire, S., and Goeddel, D. Biochemical characterization of the extracellular
domain of the 75-kd tumor necrosis factor receptor. Biochemistry 32, 3131-3138
(1993).

26. Barfod, L., Zheng, V., Kuang, W., Hart, M., Evans, T., Cerione, R., and
Ashkenazi, A. Cloning and expression of a human CDC42 GTPase Activating
Protein reveals a functional SH3-binding domain. J. Biol. Chem. 268, 26059-
26062 (1993).

27. Chamow, S., Zhang, D., Tan, X., Mhatre, S., Marsters, S., Peers, D., Byrn, R.,
Ashkenazi, A., and Yunghans, R. A humanized bispecific immunoadhesin-
antibody that retargets CD3+ effectors to kill HIV-1-infected cells. J. Immunol.
153, 4268-4280 (1994).

28. Means, R., Krantz, S., Luna, J., Marsters, S., and Ashkenazi, A. Inhibition of
murine erythroid colony formation in vitro by interferon γ and correction by
interferon γ receptor immunoadhesin. Blood 83, 911-915 (1994).

29. Haak-Frendscho, M., Marsters, S., Mordenti, J., Gillet, N., Chen, S.,
and Ashkenazi, A. Inhibition of TNF by a TNF receptor immunoadhesin:
comparison with an anti-TNF mAb. J. Immunol. 152, 1347-1353 (1994).



30. Chamow, S., Kogan, T., Venuti, M., Gadek, T., Peers, D., Mordenti, J., Shak, S.,
and Ashkenazi, A. Modification of CD4 immunoadhesin with monomethoxy-
PEG aldehyde via reductive alkylation. Bioconj. Chem. 5, 133-140 (1994).

31. Jin, H., Yang, R., Marsters, S., Bunting, S., Wurm, F., Chamow, S., and
Ashkenazi, A. Protection against rat endotoxic shock by p55 tumor necrosis factor
(TNF) receptor immunoadhesin: comparison to anti-TNF monoclonal antibody. J.
Infect. Diseases 170, 1323-1326 (1994).

32. Beck, J., Marsters, S., Harris, R., Ashkenazi, A., and Chamow, S. Generation of
soluble interleukin-1 receptor from an immunoadhesin by specific cleavage. Mol.
Immunol. 31, 1335-1344 (1994).

33. Pitti, B., Marsters, M., Haak-Frendscho, M., Osaka, G., Mordenti, J., Chamow, S.,
and Ashkenazi, A. Molecular and biological properties of an interleukin-1
receptor immunoadhesin. Mol. Immunol. 31, 1345-1351 (1994).

34. Oksenberg, D., Havlik, S., Peroutka, S., and Ashkenazi, A. The third intracellular
loop of the 5-HT2 receptor specifies effector coupling. J. Neurochem. 64, 1440-
1447 (1995).

35. Bach, E., Szabo, S., Dighe, A., Ashkenazi, A., Aguet, M., Murphy, K., and
Schreiber, R. Ligand-induced autoregulation of receptor β chain expression
in T helper cell subsets. Science 270, 1215-1218 (1995).

36. Jin, H., Yang, R., Marsters, S., Ashkenazi, A., Bunting, S., Marra, M., Scott, R.,
and Baker, J. Protection against endotoxic shock by bactericidal/permeability-
increasing protein in rats. J. Clin. Invest. 95, 1947-1952 (1995).

37. Marsters, S., Penica, D., Bach, E., Schreiber, R., and Ashkenazi, A. Interferon γ
signals via a high-affinity multisubunit receptor complex that contains two types
of polypeptide chain. Proc. Natl. Acad. Sci. USA. 92, 5401-5405 (1995).

38. Van Zee, K., Moldawer, L., Oldenburg, H., Thompson, W., Stackpole, S.,
Montegut, W., Rogy, M., Meschter, G., Gallati, H., Schiller, C., Richter, W.,
Loetcher, H., Ashkenazi, A., Chamow, S., Wurm, F., Calvano, S., Lowry, S., and
Lesslauer, W. Protection against lethal E. coli bacteremia in baboons by
pretreatment with a 55-kDa TNF receptor-Ig fusion protein, Ro45-2081. J.
Immunol. 156, 2221-2230 (1996).

39. Pitti, R., Marsters, S., Ruppert, S., Donahue, C., Moore, A., and Ashkenazi, A.
Induction of apoptosis by Apo-2 Ligand, a new member of the tumor necrosis
factor cytokine family. J. Biol. Chem. 271, 12687-12690 (1996).
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40. Marsters, S., Pitti, R., Donahue, C., Rupert, S., Bauer, K., and Ashkenazi, A.
Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by
CrmA. Curr. Biol. 6, 1669-1676 (1996).

41. Marsters, S., Skubatch, M., Gray, C., and Ashkenazi, A. Herpesvirus entry
mediator, a novel member of the tumor necrosis factor receptor family, activates
the NF-κB and AP-1 transcription factors. J. Biol. Chem. 272, 14029-14032
(1997).

42. Sheridan, J., Marsters, S., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D.,
Ramakrishnan, L., Gray, C., Baker, K., Wood, W.J., Goddard, A., Godowski, P., and
Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and
decoy receptors. Science 277, 818-821 (1997).

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A.,
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997).

44. Marsters, A., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A.
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol.
8, 525-528 (1998).

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand:
a novel weapon against malignant glioma? FEBS Lett. 427, 124-128 (1998).

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A., and MacDonald, T. A p55 TNF
receptor immunoadhesin prevents T cell mediated intestinal injury by inhibiting
matrix metalloproteinase production. J. Immunol. 160, 4098-4103 (1998).

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A.,
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W.,
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature
396, 699-703 (1998).

48. Mori, S., Murakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B.
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis
by actinomycin D. J. Immunol. 162, 5616-5623 (1999).

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A.,
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K.,
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr.
Biol. 9, 215-218 (1999).
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50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie,
C., Chiang, L., McMurtrey, A., Hebert, A., DeForge, L., Koumenis, I., Lewis, D.,
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104, 155-
162 (1999).

51. Chuntharapai, A., Gibbs, V., Lu, J., Ow, A., Marsters, S., Ashkenazi, A., De Vos,
A., Kim, K.J. Determination of residues involved in ligand binding and signal
transmission in the human IFN-α receptor 2. J. Immunol. 163, 766-773 (1999).

52. Johnsen, A.-C., Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A.,
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672
(1999).

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans,
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant
glioma xenografts by Apo2L/TRAIL. Biochem. Biophys. Res. Commun. 265, 479-
483 (1999).

54. Hymowitz, S.G., Christinger, H.W., Fuh, G., Ultsch, M., O'Connell, M., Kelley,
R.F., Ashkenazi, A., and de Vos, A.M. Triggering Cell Death

Structure of Apo2L/TRAIL in a Complex with Death Receptor 5. Molec. Cell 4,
563-571 (1999).

55. Hymowitz, S.G., O'Connell, M.P., Ultsch, M.H., Hurst, A., Totpal, K., Ashkenazi,
A., de Vos, A.M., Kelley, R.F. A unique zinc-binding site revealed by a
high-resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 39, 633-
640 (2000).

56. Zhou, Q., Fukushima, P., DeGraff, W., Mitchell, J.B., Stetler-Stevenson, M.,
Ashkenazi, A., and Steeg, P.S. Radiation and the Apo2L/TRAIL apoptotic
pathway preferentially inhibit the colonization of premalignant human breast
cancer cells overexpressing cyclin D1. Cancer Res. 60, 2611-2615 (2000).

57. Kischkel, F.C., Lawrence, D.A., Chuntharapai, A., Schow, P., Kim, J., and
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000).

58. Yan, M., Marsters, S.A., Grewal, I.S., Wang, H., *Ashkenazi, A., and *Dixit,
V.M. Identification of a receptor for BLyS demonstrates a crucial role in humoral
immunity. Nature Immunol. 1, 37-41 (2000).
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59. Marsters, S.A., Yan, M., Pitti, R.M., Haas, P.E., Dixit, V.M., and Ashkenazi, A.
Interaction of the TNF homologues BLyS and APRIL
homologues BCMA and TACI. Curr. Biol.

60. Kischkel, F.C., and Ashkenazi, A. Combining enhanced metabolic labeling with
immunoblotting to detect interactions of endogenous cellular proteins.
Biotechniques 29, 506-512 (2000).

61. Lawrence, D., Shahrokh, Z., Marsters, S., Achilles, K., Shih, D., Mounho, B.,
Hillan, K., Totpal, K., DeForge, L., Schow, P., Hooley, J., Sherwood, S., Pai, R.,
Leung, S., Khan, L., Gliniak, B., Bussiere, J., Smith, C., Strom, S., Kelley, S.,
Fox, J., Thomas, D., and Ashkenazi, A. Differential hepatocyte toxicity of
recombinant Apo2L/TRAIL versions. Nature Med.

62. Chuntharapai, A., Dodge, K., Grimmer, K., Schroeder, K., Marsters, S.A.,
Koeppen, H., Ashkenazi, A., and Kim, K.J. Isotype-dependent inhibition of
tumor growth in vivo by monoclonal antibodies to death receptor 4. J. Immunol.
166, 4891-4898 (2001).

63. Pollack, I.F., Erff, M., and Ashkenazi, A. Direct stimulation of apoptotic
signaling by soluble Apo2L/tumor necrosis factor-related apoptosis-inducing
ligand leads to selective killing of glioma cells. Clin. Cancer Res. 7, 1362-1369
(2001).

64. *Wang, H., Marsters, S.A., Baker, T., Chan, B., Lee, W.P., Fu, L., Tumas, D., Yan,
M., Dixit, V.M., *Ashkenazi, A., and *Grewal, I.S. TACI-ligand interactions are
required for T cell activation and collagen-induced arthritis in mice. Nature
Immunol. 2, 632-637 (2001).

65. Kischkel, F.C., Lawrence, D.A., Tinel, A., Virmani, A., Schow, P., Gazdar, A.,
Blenis, J., Arnott, D., and Ashkenazi, A. Death receptor recruitment of
endogenous caspase-10 and apoptosis initiation in the absence of caspase-8. J.
Biol. Chem. 276, 46639-46646 (2001).

66. LeBlanc, H., Lawrence, D.A., Varfolomeev, E., Totpal, K., Morlan, J., Schow, P.,
Fong, S., Schwall, R., Sinicropi, D., and Ashkenazi, A. Tumor cell resistance to
death receptor induced apoptosis through mutational inactivation of the
proapoptotic Bcl-2 homolog Bax. Nature Med. 8, 274-281 (2002).

67. Miller, K., Meng, G., Liu, J., Hurst, A., Hsei, V., Wong, W-L., Ekert,
Lawrence, D., Sherwood, S., DeForge, L., Gaudreault, Keller, G., Sliwkowski,
M., Ashkenazi, A., and Presta, L. Design, construction, and analyses of
multivalent antibodies. J. Immunol. 170, 4854-4861 (2003).
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68. Varfolomeev, E., Kischkel, F., Martin, F., Wang, H., Lawrence, D., Olsson, C.,
Tom, L., Erickson, S., French, P., Schow, P., Grewal, I., and Ashkenazi, A.
Immune system development in APRIL knockout mice. Submitted.

Review articles:

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. J.
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold
Spring Harbor Symposium on Quantitative Biology. LIII, 263-272 (1988).

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functional diversity of muscarinic receptor subtypes in cellular signal
transduction and growth. Trends Pharmacol. Sci. Dec Supplement, 12-21 (1989).

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn,
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992).

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol. 10,
217-225 (1993).

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27, (1994).

6. Krantz, S. B., Means, R. T., Jr., Luna, J., Marsters, S. A., and Ashkenazi, A.
Inhibition of erythroid colony formation in vitro by gamma interferon. In
Molecular Biology of Hematopoiesis (N. Abraham, R. Shadduck, A. Levine, F.
Takaku, eds.) Intercept Ltd. Paris, Vol. 3, p. 135-147 (1994).

7. Ashkenazi, A. Cytokine neutralization as a potential therapeutic approach for
SIRS and shock. J. Biotechnology in Healthcare 1, 197-206 (1994).

8. Ashkenazi, A., and Chamow, S. M. Immunoadhesins: an alternative to human
monoclonal antibodies. Immunomethods: A companion to Methods in
Enzymology 8, 104-115 (1995).

9. Chamow, S., and Ashkenazi, A. Immunoadhesins: Principles and Applications.
Trends Biotech. 14, 52-60 (1996).

10. Ashkenazi, A., and Chamow, S. M. Immunoadhesins as research tools and
therapeutic agents. Curr. Opin. Immunol. 9, 195-200 (1997).

11. Ashkenazi, A., and Dixit, V. Death receptors: signaling and modulation. Science
281, 1305-1308 (1998).

12. Ashkenazi, A., and Dixit, V. Apoptosis control by death and decoy receptors.
Curr. Opin. Cell Biol. 11, 255-260 (1999).
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13. Ashkenazi. A, Chapters on Apo2I/^U^JL; D]R.4, bR5, DcRl, DcR2; and PcR3. 

Online Cytokine Handbook (ww^apnetcom/cvtokmCTefereb 
14. Ashkenazi, A. Targeting death and decoy receptors of the tumor necrosis factor
superfamily. Nature Rev. Cancer

15. LeBlanc, H., and Ashkenazi, A. Apoptosis signaling by Apo2L/TRAIL. Cell Death
and Differentiation 10, 66-75

16. Almasan, A., and Ashkenazi, A. Apo2L/TRAIL: apoptosis signaling, biology, and
potential for cancer therapy. Cytokine and Growth Factor Reviews 14, 337-348
(2003).



Book: 

Antibody Fusion Proteins (Chamow, S., and Ashkenazi, A., eds., John Wiley and
Sons Inc.) (1999).

Talks: 

1. Resistance of primary HIV isolates to CD4 is independent of CD4-gp120 binding
affinity. UCSD Symposium, HIV Disease: Pathogenesis and Therapy.
Greenelefe, FL, March 1991.

2. Use of immuno-hybrids to extend the half-life of receptors. IBC conference on
Biopharmaceutical Halflife Extension. New Orleans, LA, June 1992.

3. Results with TNF receptor Immunoadhesins for

conference on Endotoxemia and Sepsis. Philadelphia, PA, June 1992.

4. Immunoadhesins: an alternative to human antibodies. IBC conference on
Antibody Engineering. San Diego, CA, December 1993.

5. Tumor necrosis factor receptor: a potential therapeutic for human septic shock.
American Society for Microbiology Meeting, Atlanta, GA, May 1993.

6. Protective efficacy of TNF receptor immunoadhesin vs anti-TNF monoclonal
antibody in a rat model for endotoxic shock. 5th International Congress on TNF.
Asilomar, CA, May 1994.

7. Interferon-γ signals via a multisubunit receptor complex that contains two types of
polypeptide chain. American Association of Immunologists Conference. San
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HEKr2/neu Breast Cancer Predictive Testing 



?f S!i^!l ' ^"^^^f fif5>eai5.to be increasing -fai fhe 85:7%); Ms is c^mipared to a leomrcace tate' of i 6 7% for 

85%, on demographic gioup,. with a significant nea^tosdcpcndsbjninonfcecHiilcBlw^ a»n«imi 

pefoctttege ofy^aiexp«jei^ fo/Jhe V^is HSH test was gramcd based in dinicS^ 

iwthmlOycaiBof diagnosis. Oneofthef^^ 5iivd1vinal549node.positive^^ 

^cydopbp^phamide, Adiiamycto. and S-flnoromirfi (CAl? 
hasmeta^misizc^. jWostnoderposi^ are given adjn. The smdy ^owed that patients with amplified HEjt»^ 
Wth«py w^chfa^^ However, 20%. benefited from ti^itment witfi higher d^s ot^^^ 

•Wo^^f-patientswithout^iHary-node-in^^^^ i^cdthemi%-whiU.thoscwith«SHEI^2A^^ 
d^i^w^tdisease^and ntt: The study, the^foft idenfiiied a sab^ of wom^iSo 

benefit fiom.ijlcieased surveiUance. early intervention, and did not n«?d to be exposed to the issocl^ side effe^n 

'^d!:..:;.^^:!.* .t' 1. ^ - •addWon.otKercvidenc^fedicatestbatl^ 

PiD^st^ mari^i^ currently used b breast cancer recur- tibii In hode^negathre patients can be used^ an inde^^ 

fence pitsdiction mcludenimor size, histological gmde. steroid prognos^o indicator for eatly t^N^mrenoe. lecunetit <^^»^ 

horrtone jwq>tor status. DNAp^^^ aiv-time.attddisease.related deatk^DcmonstiatlQiiTtf^ 

cathepsin D status, fepression of growth ftctor receptors and ihiksn gene, ampliilcation by HSH has alsoWn shoWn tote 

over-^ression of the HER'2/nca oncogene have also been -^•-t^- « Lis-^-. i^Jj . ^ ^ ^ 

IdentiiBed as having value regarding treatm^t legtmSui mid' 

ipWgnpsis^ ^ . _ . 

HER-2/neu (also known as c-erbB-2) is an oncogene that encodes a
transmembrane glycoprotein that is homologous to, but distinct from, the
epidermal growth factor receptor. Numerous studies have indicated that high
levels of expression of this protein are associated with rapid tumor growth,
certain forms of therapy resistance, and shorter disease-free survival. The
gene has been shown to be amplified and/or overexpressed in 10%-30% of
invasive breast cancers and in 40%-60% of intraductal breast carcinoma.

There are two distinct FDA-approved methods by which HER-2/neu status can be
evaluated: immunohistochemistry (IHC, HercepTest™) and FISH (fluorescent in
situ hybridization, PathVysion™ Kit). Both methods can be performed on
archived and current specimens. The first method allows visual assessment of
the amount of HER-2/neu protein present on the cell membrane. The latter
method allows direct quantification of the level of gene amplification
present in the tumor, enabling differentiation between low- versus
high-amplification. At least one study has demonstrated a difference in



of value in predicting response to chemotherapy in stage-2



Selection of patients for Herceptin® monoclonal antibody therapy, however,
is based upon demonstration of HER-2/neu protein overexpression

Studies using Herceptin® in patients with metastatic breast
cancer show an increase in time to disease progression,
increased response rate to chemotherapeutic agents and a small
increase in overall survival rate. The FISH assays have not yet
been approved for this purpose, and studies looking at response
to Herceptin® in patients with or without gene amplification
status determined by FISH are in process.

In general, FISH and IHC results correlate well. However,
subsets of tumors are found which show discordant results;
i.e., protein overexpression without gene amplification or lack
of protein overexpression with gene amplification. The clinical
significance of such results is unclear. Based on the above
considerations, HER-2/neu testing will utilize
immunohistochemistry (HercepTest™) as a screen, followed by
FISH in IHC-negative cases. Alternatively, either method may be
ordered individually depending on the clinical setting or
clinician preference.



CPT Code Information

HER-2/neu via IHC
88342 (including interpretive report)

HER-2/neu via FISH
88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization, analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation and report

Procedural Information

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest™. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary rabbit antibody
to human HER-2/neu protein, the kit employs a ready-to-use
dextran-based visualization reagent. This reagent consists of
both secondary goat anti-rabbit antibody molecules and
horseradish peroxidase molecules linked to a common dextran
polymer backbone, thus eliminating the need for sequential
application of link antibody and peroxidase-conjugated antibody.
Enzymatic conversion of the subsequently added chromogen
results in formation of a visible reaction product at the antigen
site. The specimen is then counterstained; a pathologist using
light microscopy interprets the results.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit
contains two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, Spectrum Green) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (Spectrum Orange). Enumeration
of the probes allows a ratio of the number of copies
of chromosome 17 to the number of copies of HER-2/neu to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the minority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of this data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
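
The ratio cutoffs described above can be sketched in a few lines of Python. This is only an illustration, not the laboratory's scoring procedure: the function name, the choice to express the ratio as HER-2/neu signals per chromosome 17 centromere signal, and the "borderline" label for values between 2.0 and 2.1 (a range the text does not address) are all assumptions.

```python
def classify_her2_fish(her2_signals: int, cep17_signals: int) -> tuple[float, str]:
    """Classify a FISH count using the cutoffs quoted in the text.

    her2_signals and cep17_signals are total probe signals counted over
    the scored nuclei (hypothetical inputs for illustration).
    """
    if cep17_signals <= 0:
        raise ValueError("at least one chromosome 17 signal is required")
    ratio = her2_signals / cep17_signals
    if ratio < 2.0:
        label = "normal/negative"      # "less than 2.0" in the text
    elif ratio >= 2.1:
        label = "amplified"            # "2.1 and over" in the text
    else:
        label = "borderline"           # assumption: the text leaves this gap open
    return ratio, label


print(classify_her2_fish(30, 20))   # (1.5, 'normal/negative')
print(classify_her2_fish(100, 20))  # (5.0, 'amplified')
```

Reporting the ratio alongside the label keeps the degree of amplification visible, which the text identifies as part of the result.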



CHICKEN INTERLEUKIN-15 AND USES
THEREOF

This is a divisional of U.S. patent application Ser. No. 
09/368,613 filed Aug. 4, 1999, which issued as U.S. Pat. No. 
6,287,554 and is a divisional of U.S. patent application Ser. 
No. 08/729,004, filed on Oct. 10, 1996 and now issued as 
U.S. Pat. No. 6,190,901. U.S. patent application Ser. No.
08/729,004 claims priority to U.S. provisional patent appli- 
cation Serial No. 60/005,682 filed on Oct. 17, 1995. Each of 
these prior applications is hereby incorporated herein by 
reference, in its entirety. 

FIELD OF INVENTION 

The present invention pertains to isolated genes encoding
avian interleukin-15 and to purified interleukin-15 polypep-
tides.

BACKGROUND OF THE INVENTION 

Most chickens produced in developed countries for con- 
sumption and egg-laying (at least 10 billion per year) are 
vaccinated to protect them against Marek's disease. All of 
the egg-laying chickens and breeder stocks are also vacci-
nated with Newcastle Disease Virus, Infectious Bursal Dis-
ease Virus, Infectious Bronchitis Virus, Fowlpox Virus and
Coccidial vaccines. For optimal protection, Marek's vacci-
nation is performed either at or before hatching. One
obstacle to the development of efficacious pre-hatching and
at-hatching vaccination regimens is that the embryonic and
newly hatched avian immune system is not fully developed 
and cannot mount as effective an immune response to the 
immunogen as at 2-3 weeks after hatching. Thus, there is a 
need in the art for agents and compositions that enhance the 
effectiveness of pre- and post-hatching avian vaccines. 

Interleukin-2 and interleukin-15 are related cytokines that
stimulate the activity and proliferation of T cells in mam-
mals. Though IL-2 and IL-15 both interact with the β and γ
chains of the IL-2 receptor, and may share some elements of 
tertiary structure, the two polypeptides are not homologous 
and represent distinct gene products. 

The genes encoding IL-15 from several different mam-
malian species share a high degree of homology. For
example, human and simian IL-15 share 97% amino acid
homology. By contrast, chicken IL-15, which is the subject
of the present invention, shares only 25% amino acid
identity with mammalian IL-15. Another distinguishing
characteristic of chicken IL-15 is that it (and not the mam- 
malian forms) is produced by mitogen-activated spleen 
cells. Accordingly, the discovery of chicken IL-15 and the 
finding that it possesses T cell-stimulatory activity provide 
a novel reagent for vaccine augmentation in avian species. 
Without wishing to be bound by theory, the bioactivity of 
mammalian IL-15 in stimulating skeletal muscle develop- 
ment suggests that avian IL-15s are also useful in stimulat- 
ing growth in avian species. 

SUMMARY OF THE INVENTION 

The present invention provides isolated and purified DNA 
encoding avian interleukin-15 (IL-15), as well as cloning 
and expression vectors comprising IL-15 DNA and cells 
transformed with IL-15-encoding vectors. Avian species 
from which IL-15 may be derived include without limitation 
chicken, turkey, duck, goose, quail and pheasant. 

The invention also provides isolated and purified avian 
IL-15 polypeptide, the native secreted or mature form of
which has a molecular mass of about 14 kDa, an isoelectric
point of about 6.57, a net charge of -2, and a hydrophilicity
index of 0.278, and which has the ability to stimulate
mitogen-activated avian T cells and to promote the growth
of other cell types. IL-15 according to the present invention
may be obtained from native or recombinant sources. 

Also encompassed by the invention are sequence- 
conservative and function-conservative variants of avian 
IL-15 DNA and IL-15 polypeptides, including, for example,
a bioactive IL-15 sequence or sub-fragment that is fused
in-frame to a purification sequence. 

In another aspect, the invention provides a method for
enhancing an immune response in fowl to an immunogen, 
which is achieved by administering the immunogen before, 
after, or substantially simultaneously with avian IL-15 in an 
amount effective to enhance the immune response. 

In yet another aspect, the invention provides a vaccine for 
inducing an immune response in fowl to an immunogen,
comprising the immunogen and an effective amount of avian
interleukin-15 for immune response enhancement. The
immunogen may be derived, for example, from avian patho-
gens such as Marek's Disease Virus, Newcastle Disease
Virus, Infectious Bursal Disease Virus, Infectious Bronchitis
Virus, and the like.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is an illustration of an 845 nt sequence including
747 nt of cDNA sequence encoding chicken interleukin-15
(IL-15) (SEQ ID NO:1).

FIG. 2 is an illustration of a 143-amino acid sequence
corresponding to the chicken interleukin-15 precursor
polypeptide (SEQ ID NO:2).

DETAILED DESCRIPTION OF THE INVENTION

All patent applications, patents, and literature references
cited in this specification are hereby incorporated by refer-
ence in their entirety. In case of conflict, the present
description, including definitions, will control.

The present invention encompasses interleukin-15 (IL- 
15) from avian species. The invention provides isolated and 
purified nucleic acids encoding avian IL-15, as well as IL-15 
polypeptides purified from either native or recombinant 
sources. Avian IL-15 produced according to the present 
invention may be used in commercial fowl cultivation to
promote growth and to enhance the efficacy of avian vac- 
cines. 

Nucleic Acids, Vectors, Transformants

The sequence of the cDNA encoding chicken IL-15 is 
shown in FIG. 1 (SEQ ID NO:1), and the predicted amino
acid sequence of chicken IL-15 is shown in FIG. 2 (SEQ ID
NO:2). The designation of this avian polypeptide as IL-15 is
based on partial amino acid sequence homology to mam-
malian IL-15 and the ability of the polypeptide to stimulate
mitogen-activated T cells (see below). Furthermore, without
wishing to be bound by theory, it is predicted that avian
IL-15 polypeptides also exhibit one or more of the following
bioactivities: activation of NK (natural killer) cells, stimu-
lation of B-Cell maturation, proliferation of mast cells, and
interaction with the beta and gamma subunits of the IL-2 
receptor. 
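
The degeneracy of the genetic code, which the description relies on when it speaks of "sequence-conservative" variants, can be illustrated with a short Python sketch. The sequences below are hypothetical nine-nucleotide examples, not taken from FIG. 1, and the helper names are illustrative assumptions; only the standard genetic code table itself is factual.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_sequence_conservative(dna_a: str, dna_b: str) -> bool:
    """True when two DNAs differ (or not) but encode the same amino acids."""
    return translate(dna_a) == translate(dna_b)


# Hypothetical sequences: AAA and AAG both encode lysine, GGT and GGC glycine.
print(translate("ATGAAAGGT"))                              # MKG
print(is_sequence_conservative("ATGAAAGGT", "ATGAAGGGC"))  # True
# AAA -> AGA swaps lysine for arginine, which changes the protein sequence.
print(is_sequence_conservative("ATGAAAGGT", "ATGAGAGGT"))  # False
```

The last case changes the encoded residue from lysine to arginine, so it is not sequence-conservative; whether such a substitution preserves function is a separate question that this sketch does not test.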

Because of the degeneracy of the genetic code (i.e.,
multiple codons encode certain amino acids), DNA
sequences other than that shown in FIG. 1 can also encode
the chicken IL-15 amino acid sequences shown in FIG. 2.
Such other DNAs include those containing "sequence-conservative"
variations in which a change in one or more nucleotides in a given
codon results in no alteration in the amino acid encoded at that
position. Furthermore, a given amino acid residue in a polypeptide
can often be changed without altering the overall conformation and
function of the native polypeptide. Such "function-conservative"
variants include, but are not limited to, replacement of an amino
acid with one having similar physico-chemical properties, such as,
for example, acidic, basic, hydrophobic, and the like (e.g.,
replacement of lysine with arginine, aspartate with glutamate, or
glycine with alanine). In addition, amino acid sequences may be
added or deleted without destroying the bioactivity of the molecule.
For example, additional amino acid sequences may be added at either
amino- or carboxy-terminal ends to serve as purification tags (i.e.,
to allow one-step purification of the protein, after which they may
be chemically or enzymatically removed). Alternatively, the
additional sequences may confer an additional cell-surface binding
site or otherwise alter the target cell specificity of IL-15.

The chicken IL-15 cDNAs within the scope of the present invention
are those of FIG. 1, sequence-conservative variant DNAs, DNA
sequences encoding function-conservative variant polypeptides, and
combinations thereof. The invention encompasses fragments of avian
interleukin-15 that exhibit a useful degree of bioactivity, either
alone or in combination with other sequences or components. As
explained below, it is well within the ordinary skill in the art to
predictively manipulate the sequence of IL-15 and establish whether
a given avian IL-15 variant possesses an appropriate stability and
bioactivity for a given application. This can be achieved by
expressing and purifying the variant IL-15 polypeptide in a
recombinant system and assaying its T-cell stimulatory activity
and/or growth-promoting activity in cell culture and in animals,
followed by testing in the application.

The present invention also encompasses IL-15 DNAs (and polypeptides)
derived from other avian species, including without limitation
ducks, turkeys, pheasants, quail and geese. Avian IL-15 homologues
of the chicken sequence shown in FIG. 1 are easily identified by
screening cDNA or genomic libraries to identify clones that
hybridize to probes comprising all or part of the sequence of FIG. 1.
Alternatively, expression libraries may be screened using antibodies
that recognize chicken IL-15. Without wishing to be bound by theory,
it is anticipated that IL-15 genes from other avian species will
share at least about 70% homology with the chicken IL-15 gene. Also
within the scope of the invention are DNAs that encode chicken
homologues of IL-15, defined as DNA encoding polypeptides that share
at least about 25% amino acid identity with chicken IL-15.

Generally, nucleic acid manipulations according to the present
invention use methods that are well known in the art, such as those
disclosed in, for example, Molecular Cloning, A Laboratory Manual
(2nd Ed., Sambrook, Fritsch and Maniatis, Cold Spring Harbor), or
Current Protocols in Molecular Biology (Eds. Ausubel, Brent,
Kingston, Moore, Seidman, Smith and Struhl, Greene Publ. Assoc.,
Wiley-Interscience, NY, N.Y., 1992).

The present invention encompasses cDNA and RNA sequences and sense
and antisense sequences. The invention also encompasses genomic
avian IL-15 polypeptide DNA sequences and flanking sequences,
including, but not limited to, regulatory sequences. Nucleic acid
sequences encoding avian IL-15 polypeptide(s) may also be associated
with heterologous sequences, including promoters, enhancers,
response elements, signal sequences, polyadenylation sequences,
introns, 5'- and 3'-noncoding regions, and the like. Transcriptional
regulatory elements that may be operably linked to avian IL-15
polypeptide DNA sequence(s) include without limitation those that
have the ability to direct the expression of genes derived from
prokaryotic cells, eukaryotic cells, viruses of prokaryotic cells,
viruses of eukaryotic cells, and any combination thereof. Other
useful heterologous sequences are known to those skilled in the art.

The nucleic acids of the present invention can be modified by
methods known to those skilled in the art to alter their stability,
solubility, binding affinity, and specificity. For example, the
sequences can be selectively methylated. The nucleic acid sequences
of the present invention may also be modified with a label capable
of providing a detectable signal, either directly or indirectly.
Exemplary labels include radioisotopes, fluorescent molecules,
biotin, and the like.

The present invention also provides vectors that include nucleic
acids encoding the avian IL-15 polypeptide(s). Such vectors include,
for example, plasmid vectors for expression in a variety of
eukaryotic and prokaryotic hosts. Preferably, the vectors also
include a promoter operably linked to the avian IL-15 polypeptide
encoding portion. The encoded avian IL-15 polypeptide(s) may be
expressed by using any suitable vectors and host cells as explained
herein or otherwise known to those skilled in the art.

Vectors will often include one or more replication systems for
cloning or expression, one or more markers for selection in the
host such as, for example, antibiotic resistance, and one or more
expression cassettes. The inserted coding sequences may be
synthesized, isolated from natural sources, prepared as hybrids, or
the like. Ligation of the coding sequences to the transcriptional
regulatory sequences may be achieved by methods known to those
skilled in the art. Suitable host cells may be
transformed/transfected/infected by any suitable method including
electroporation, CaCl2- or liposome-mediated DNA uptake, fungal
infection, microinjection, microprojectile, or the like.

Suitable vectors for use in practicing the present invention include
without limitation YEp352, pcDNAI (InVitrogen, San Diego, Calif.),
pRc/CMV (InVitrogen), and pSFV1 (GIBCO/BRL, Gaithersburg, Md.). One
preferred vector for use in the invention is pSFV1. Suitable host
cells include E. coli, yeast, COS cells, PC12 cells, CHO cells,
GH4C1 cells, BHK-21 cells, and amphibian melanophore cells. BHK-21
cells are a preferred host cell line for use in practicing the
present invention.

Nucleic acids encoding avian IL-15 polypeptide(s) may also be
introduced into cells by recombination events. For example, such a
sequence can be microinjected into a cell, effecting homologous
recombination at the site of an endogenous gene encoding the
polypeptide, an analog or pseudogene thereof, or a sequence with
substantial identity to an avian IL-15 polypeptide-encoding gene.
Other recombination-based methods such as non-homologous
recombinations, and deletion of endogenous gene by homologous
recombination, especially in pluripotent cells, may also be used.

IL-15 Polypeptides

The chicken IL-15 gene (the cDNA of which is shown in FIG. 1)
encodes a polypeptide of 143 amino acids (FIG. 2). Without wishing
to be bound by theory, by comparison with simian IL-15, and by use
of an accepted procedure to predict signal peptidase cleavage sites
(von Heijne, Nuc. Acids Res., 14:4683, 1986), it is predicted that
an aminoterminal leader sequence of about 22 amino acids (secretion
signal peptide) is cleaved from the primary translation product to
produce mature IL-15. The predicted mature sequence of 121 amino
acids is further characterized by a predicted molecular weight of
13,971 daltons; an isoelectric point of 6.57; four cysteine residues
(at amino acids numbers 63, 70, 116, and 119 in the precursor IL-15
shown in FIG. 2) that correspond to four cysteines conserved among
human, mouse, and monkey IL-15 and that are believed to participate
in intramolecular disulfide bonding; and one consensus site for
N-linked glycosylation (at asparagine 110 of the sequence shown in
FIG. 2) which corresponds to a similar site in human IL-15.

Purification of IL-15 from natural or recombinant sources may be
achieved by methods well-known in the art, including without
limitation ion-exchange chromatography, reverse-phase chromatography
on C4 columns, gel filtration, isoelectric focusing, affinity
chromatography, immunoaffinity chromatography, and the like. In a
preferred embodiment, large quantities of bioactive IL-15 may be
obtained by constructing a recombinant DNA sequence comprising the
coding region for IL-15 fused in frame to a sequence encoding 6
C-terminal histidine residues in the pSFV1 replicon (GIBCO/BRL).
mRNA encoded by this plasmid is synthesized using techniques
well-known to those skilled in the art and introduced into BHK-21
cells by electroporation. The cells synthesize and secrete mature
glycosylated IL-15 polypeptides containing 6 C-terminal histidines.
The modified IL-15 polypeptides are easily purified from the cell
supernatant by affinity chromatography using a histidine-binding
resin (His-bind, Novagen, Madison, Wis.).

Avian IL-15 polypeptides isolated from any source can be modified by
methods known in the art. For example, avian IL-15 may be
phosphorylated or dephosphorylated, glycosylated or deglycosylated,
and the like. Especially useful are modifications that alter avian
IL-15 solubility, stability, and binding specificity and affinity.

Anti-IL-15 Antibodies

The present invention encompasses antibodies that are specific for
avian IL-15 polypeptides identified as described above. The
antibodies may be polyclonal or monoclonal, and may discriminate
avian IL-15s from different species, identify functional domains,
and the like. Such antibodies are conveniently made using the
methods and compositions disclosed in Harlow and Lane, Antibodies, A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988, other
references cited herein, as well as immunological and hybridoma
technologies known to those in the art. Where natural or synthetic
avian IL-15-derived peptides are used to induce an avian
IL-15-specific immune response, the peptides may be conveniently
coupled to a suitable carrier such as KLH and administered in a
suitable adjuvant such as Freund's. Preferably, selected peptides
are coupled to a lysine core carrier substantially according to the
methods of Tam (1988) Proc. Natl. Acad. Sci. USA, 85:5409-5413. The
resulting antibodies may be modified to a monovalent form, e.g. Fab,
FAB', or FV. Anti-idiotypic antibodies, especially internal imaging
anti-idiotypic antibodies, may also be prepared using known methods.

In one embodiment, purified avian IL-15 is used to immunize mice,
after which their spleens are removed, and splenocytes used to form
cell hybrids with myeloma cells to obtain clones of
antibody-secreting cells according to techniques that are standard
in the art. The resulting monoclonal antibodies secreted by such
cells are screened using in vitro assays for the following
activities: binding to avian IL-15, inhibiting the receptor-binding
activity of IL-15, and inhibiting the T-cell stimulatory activity of
IL-15.

Anti-avian IL-15 antibodies may be used to identify and quantify
avian IL-15, using immunoassays such as ELISA, RIA, and the like.
Anti-avian IL-15 antibodies may also be used to immunodeplete
extracts of avian IL-15. In addition, these antibodies can be used
to identify, isolate, and purify avian IL-15s from different
sources, and to perform subcellular and histochemical localization
studies.

Applications

Avian IL-15 produced according to the present invention can be used
beneficially in homologous or heterologous avian species, for
example, to stimulate activated T-cells (Grabstein et al., Science,
264:965, 1994) and B-cells (Armitage et al., J. Immunol., 154:483,
1995) and/or to promote the growth of non-immune cells, such as, for
example, muscle cells (Quinn et al., Endocrinol. 136:3669, 1995).

Vaccines

The present invention encompasses methods and compositions for
enhancing the efficacy of an immune response in avian species. In
this embodiment, avian IL-15 is used in conjunction with an
immunogen for which it is desired to elicit an immune response. For
example, in avian vaccines, such as those against Marek's disease,
Newcastle Disease and other pathogens such as Infectious Bursal
Disease Virus and Infectious Bronchitis Virus, it is desirable to
include avian IL-15 in the vaccine to enhance the magnitude and
quality of the immune response. For this purpose, IL-15 purified
from native or recombinant sources as described above is included in
the vaccine formulation at a concentration ranging from about 0.01
to about 1.0 μg per vaccine per chicken.

IL-15 may be administered in conjunction with a live (i.e.,
replicating) vaccine or a non-replicating vaccine. Non-limiting
examples of replicating vaccines are those comprising native or
recombinant viruses or bacteria, such as modified turkey herpesvirus
or modified fowlpox virus. Non-limiting examples of non-replicating
vaccines are those comprising killed or inactivated viruses or other
microorganisms, or crude or purified antigens derived from native,
recombinant, or synthetic sources, such as, for example, coccidial
vaccines. Commercial sources for avian vaccines include without
limitation: Rhone Merieux Laboratoire-IFFA (Lyon, France); Intervet
International BV (Boxmeer, The Netherlands); Mallinckrodt
Veterinary; Solvay Animal Health (Mendota Heights, Minn.);
Hoechst-Roussel (Knoxville, Tenn.); and Nippon Zeon Co., Ltd.
(Kawasaki-Kiu, Japan).

In one embodiment, the gene encoding IL-15 is incorporated into a
recombinant virus, which is then formulated into a live vaccine. The
IL-15 gene is incorporated into the virus so that its expression is
controlled by an appropriate promoter. Administration of the vaccine
results in the expression of bioactive IL-15 in close temporal and
spatial proximity to the desired immune response, thus enhancing the
vaccine's efficacy.

IL-15 may be administered to birds as part of a vaccine formulation
either before or after hatching, preferably before hatching, using
methods known in the art such as those described in U.S. Pat. Nos.
5,034,513 and 5,028,421.

Growth Promotion

The present invention provides methods and compositions for
enhancing the growth of avian species for medical and/or commercial
purposes. In this embodiment, IL-15 is administered to birds using
any appropriate mode of administration. For growth promotion, IL-15
is administered in amounts ranging from about 0.25 μg/kg/day to
about 25 μg/kg/day. It will be understood that the required amount of
IL-15 can be determined by routine experimentation well-
known in the art, such as by establishing a matrix of dosages 
and frequencies and comparing a group of experimental 
units or subjects to each point in the matrix. 
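
A dosage-by-frequency matrix of the kind mentioned above could be laid out as follows. This is a minimal sketch under stated assumptions: the function name, the particular dose levels and dosing frequencies, and the 2.0 kg body weight are hypothetical; only the 0.25 to 25 μg/kg/day range comes from the text.

```python
from itertools import product


def dose_matrix(doses_ug_per_kg_day, doses_per_day, body_weight_kg):
    """Enumerate every (daily dose, frequency) point of the test matrix.

    Returns one dict per matrix point with the amount of IL-15 (in ug)
    that a bird of the given weight would receive at each administration.
    """
    grid = []
    for daily_dose, n_per_day in product(doses_ug_per_kg_day, doses_per_day):
        total_ug_per_day = daily_dose * body_weight_kg
        grid.append({
            "ug_per_kg_day": daily_dose,
            "doses_per_day": n_per_day,
            "ug_per_dose": total_ug_per_day / n_per_day,
        })
    return grid


# Hypothetical design: the two ends of the stated range, given once or
# twice daily to a 2.0 kg bird. One experimental group maps to each row.
for point in dose_matrix([0.25, 25], [1, 2], body_weight_kg=2.0):
    print(point)
```

Each row corresponds to one group of experimental subjects, so the size of the list is the number of groups the design requires.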

According to the present invention, native or recombinant 
avian IL-15 may be formulated with a physiologically 
acceptable carrier, such as, for example, phosphate buffered 
saline or deionized water. The formulation may also contain 
excipients, including lubricant(s), plasticizer(s), colorant(s), 
absorption enhancer(s), bactericide(s), and the like that are 
well-known in the art. The IL-15 polypeptide of the inven- 
tion may be administered by any effective means, including 
without limitation intravenous, subcutaneous, 
intramuscular, transmucosal, topical, or oral routes. For 
subcutaneous administration, for example, the dosage form 
may consist of IL-15 in sterile physiological saline. For oral 
administration, IL-15, with or without excipients, may be 
micro- or macro-encapsulated in, e.g., liposomes and micro- 
spheres. Dermal patches (or other slow-release dosage 
forms) may also be used. 

The following examples are intended to further illustrate
the invention without limiting the scope thereof.

EXAMPLE 1 

Cloning of the Chicken IL-15 Gene 

To clone chicken IL-15, a chicken spleen cell cDNA 
library derived from spleen cells that had been activated 
with concanavalin A was utilized (Kaplan, J. Immunol.
151:628, 1993). 5000 colonies were grown overnight at 35° C.
on LB agar plates containing 30 μg/ml ampicillin and 10
μg/ml tetracycline. 15-20 colonies were pooled and trans-
ferred to 10 ml Terrific Broth (containing the same
antibiotics) and grown overnight. Plasmid DNA from each
pool was then isolated by published procedures (Maniatis,
Section 1.28), treated with RNAase (10 μg/ml), and stored
in TE buffer.

The plasmid DNAs were transfected into COS-7 (ATCC)
cells using Lipofectamine (GIBCO/BRL, Gaithersburg,
Md.). 1 μg of each plasmid pool was mixed with 3 μl
Lipofectamine in 100 μl Opti-MEM medium (GIBCO/
BRL), incubated for 30 min, and then placed on COS-7 cells
that had been grown to 80-90% confluence in 12-well plates 
and rinsed in serum-free medium. The cells and DNA were 
incubated for 5 hrs at 37° C. with Dulbecco's MEM in the 
absence of serum and antibiotics, and then supplemented 
with the same medium containing 10% fetal calf serum and 
incubated overnight at 37° C. The next day, the medium was 
replaced with Dulbecco's MEM containing 10% fetal calf 
serum, penicillin, and streptomycin. After an additional 24 
hrs of incubation, the medium was collected and stored at 
-20° C. 

The cell supernatants were tested for IL-15 activity as
described in Example 2 below. Five pools with the highest
stimulation indices (1.6 to 2.1) exhibited levels of activity
that were greater than 2 standard deviations from the mean
of the remaining 278 pools. Three of the five pools remained
positive in a second screen, and were subdivided into pools
of 6. Plasmid DNA extracted from the secondary pools was
used to transfect COS-7 cells and the supernatants were
tested for IL-2-like activity. As described below in Example 
2, three positive pools were identified and subdivided to 
yield individual clones; from each pool at least one positive 
clone was isolated. 

The complete cDNA inserts of all three positive clones
were sequenced using the automated Applied Biosystems
Model 373A sequencing system. The flanking T7 and SP6
primers contained in the pcDNAI vector were used to prime
the sequencing reaction. Two of the clones, B2.16.2 and
M2.12.1, were identical and coded for the cDNA sequence
shown in FIG. 1. Clone F19.84 was similar to those two
clones, but was missing the 20 nt at its 5' end (i.e., starting
at the first ATG of the coding region) and contained a poly
T tail of at least 100 nt at its 3' end.

The entire 747 nt sequence (FIG. 1, SEQ ID NO:1) was
analyzed using a BLAST search (which accesses all of the
major international nucleotide data banks). No significant
homology was detected with any other known sequence. The
sequence was also analyzed using the MacVector software
program (MacVector 4.0; International Biotechnologies,
Inc., New Haven, Conn.) on a Mac IIci computer. This
analysis revealed an open reading frame flanked at its 5' end
by a Kozak consensus sequence for translation initiation.
The predicted amino acid sequence of this open reading
frame is shown in FIG. 2 (SEQ ID NO:2). This amino acid
sequence was analyzed using a BLASTP search (which
accesses all of the major international protein data banks)
revealing significant homology with monkey and human
precursor IL-15.

The predicted amino acid sequence of chicken IL-15
consists of a 143 amino acid polypeptide having a predicted
molecular weight of 16,305 and an isoelectric point of 6.37.
Based on the hydrophobicity of its amino terminal end and
by comparison with known signal peptide cleavage sites
(von Heijne, Nucleic Acids Res. 14:4683, 1986) it is predicted
that cleavage between glycine-22 and alanine-23
results in the removal of an amino-terminal leader sequence
of about 22 amino acids (secretion signal peptide) from the
primary product to produce mature IL-15.

The predicted mature IL-15 sequence of 121 amino acids
has a predicted molecular weight of 13,971, an isoelectric
point of 6.57, and a possible N-linked glycosylation site (at
asparagine 110 of FIG. 2). Comparisons between the predicted
amino acid sequences of IL-15 from monkey, human,
mouse and chicken and analysis of the tertiary structure of
monkey IL-15 (Grabstein, Science, 264:965, 1994) suggest
that four cysteines in chicken IL-15 (positions 63, 70, 116
and 119 of precursor IL-15, FIG. 2) are conserved and form
intrachain disulfide bonds.
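
The residue numbering in the two paragraphs above can be checked with a short sketch. The one-letter string below is a transcription of SEQ ID NO:2 from the sequence listing, reading the poorly legible "He"/"lie" as Ile and "Gin" as Gln, which is an assumption; the function names are illustrative only. The sequon pattern N-X-S/T (X not proline) is the standard rule of thumb for N-linked glycosylation, not a method stated in the text.

```python
import re

# One-letter transcription of SEQ ID NO:2 (assumed reading of the listing).
PRECURSOR = (
    "MMCKVLIFGCISVATL"
    "MTTAYGASLSSAKRKP"
    "LQTLIKDLEILENIKN"
    "KIHLELYTPTETQECT"
    "QQTLQCYLGEVVTLKK"
    "ETEDDTEIKEEFVTAI"
    "QNIEKNLKSLTGLNHT"
    "GSECKICEANNKKKFP"
    "DFLHELTNFVRYLQK"
)

SIGNAL_LENGTH = 22  # predicted cleavage between Gly-22 and Ala-23


def mature_sequence(precursor, signal_length=SIGNAL_LENGTH):
    """Remove the predicted leader sequence from the precursor."""
    return precursor[signal_length:]


def n_glycosylation_sites(seq):
    """1-based positions of Asn residues in N-X-S/T sequons (X is not Pro)."""
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)]


def cysteine_positions(seq, first=1):
    """1-based positions of Cys residues at or after position `first`."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C" and i >= first]


if __name__ == "__main__":
    mature = mature_sequence(PRECURSOR)
    print(len(PRECURSOR), len(mature))
    print(PRECURSOR[21], PRECURSOR[22])        # residues 22 and 23
    print(n_glycosylation_sites(PRECURSOR))    # precursor numbering
    print(cysteine_positions(PRECURSOR, first=SIGNAL_LENGTH + 1))
```

Under that transcription the precursor is 143 residues, the mature form 121, the only sequon lies at asparagine 110, and the cysteines of the mature region fall at precursor positions 63, 70, 116 and 119, in agreement with the text.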

EXAMPLE 2

Bioactivity Assay for Chicken IL-15

Bioactivity assays for IL-15 are performed as follows:

Concanavalin A (ConA)-activated splenic T cells are prepared
by incubating chicken spleen cells (10*^ cells/ml) with
Con A (10 μg/ml) (Sigma Chemical Co., St. Louis, Mo.) in
RPMI 1640 medium (Sigma) containing 2 mg/ml BSA,
antibiotics and glutamine at 40° C. for 24 hrs. The medium
is then replaced with Iscove's medium (Sigma) containing
2% normal chicken serum (Sigma) and 0.05M alpha-methyl
pyranoside (Sigma) for an additional 2-4 days, diluting the
cells in additional medium as needed. Blast cells are purified
from this mixture by gently layering them on a Histopaque
density gradient (Sigma) and centrifuging them according to
the manufacturer's instructions. The cells are then washed
three times and finally resuspended in assay medium
(Iscove's containing 2% normal chicken serum (Sigma)).
For the assay, 2x10^* blast cells are placed in round-bottom
96 well plates in assay medium containing IL-15 (such as,
e.g., dilutions of supernatant from transfected COS-7 cells)
or appropriate controls. After overnight incubation at 40° C.,
the cells are pulsed for 6 hrs with 3H-thymidine (0.5 μCi)
(New England Nuclear, Boston, Mass.) + fluorodeoxyuridine
(10"*M) (Sigma). The cells are then harvested on glass fiber
filters (Whatman, Clifton, N.J.), and the radioactivity is
measured in a liquid scintillation counter. IL-15 activity is
expressed as a stimulation index, which is the radioactivity in
experimental samples ÷ the radioactivity in controls
(non-transfected COS-7 supernatants). A typical result is shown in
Table 1.

TABLE 1

SOURCE OF     DNA                                  Stimulation indices
PLASMID DNA   Designation            1/10 dil(a)  1/10 dil(b)  1/33 dil(b)  1/100 dil(b)

PRIMARY       A19                    1.6          1.9          1.3          1.2
POOLS         B2                     2.1          4.2          2.3          1.7
              E7                     1.8          1.7          1.5          0.9
              F19                    1.8          3.5          2.0          1.2
              M2                     1.7          3.2          1.9          1.3
              Ave. of 278 ± SD       1.1 ± 0.1
              Ave. of 3 Neg. pools                1.4          1.3          1.1
SECONDARY     A19.7                               0.7          1.9
POOLS         B2.16                               6.0          3.5
              F19.8                               9.8          3.4
              M2.12                               3.2          2.2
INDIVIDUAL    B2.16.2                             6.6          3.3          2.7
CLONES        F19.8.4                             7.5          4.0          3.0
              M2.12.1                             7.2          3.9          3.6

(a) First screening at 1/10 dil.
(b) A repeat transfection using 5 positive and 3 negative primary pools

EXAMPLE 3

Expression and Purification of IL-15

To obtain high-level expression of chicken IL-15 in
mammalian cells, the pSFV1 eukaryotic expression vector
(which includes the Semliki Forest Virus replicon) is used
(GIBCO/BRL, Gaithersburg, Md.). Use of this vector allows
for signal peptide cleavage, glycosylation, and secretion of
mature active protein. In one embodiment, the recombinant
vector encodes an additional six histidine residues at the
carboxy terminus of the native IL-15 sequence, allowing the
efficient single step purification of the secreted protein on a
nickel column (Novagen, Madison, Wis.).

Primers were constructed that include 5' and 3' sequences
flanking the coding region of IL-15 cDNA. The 3' primer
also includes nucleotides coding for 6 histidines. These
primers were used in polymerase chain reaction (PCR),
using as a template the entire IL-15 cDNA contained within
the pcDNA1 plasmid. The resulting amplified cDNA,
including the histidine-coding sequences, was ligated into
the pSFV1 plasmid (GIBCO/BRL). The plasmid was
obtained by transforming DH5 E. coli (GIBCO/BRL) and
selecting transformants on agar plates and broth containing
ampicillin.

This plasmid is used as a template to produce mRNA in
vitro, using manufacturer's protocols. The mRNA is transfected
into BHK-21 cells by electroporation, using 10 μg
RNA per 10'' cells, after which the cells are incubated for
1-3 days. The cell supernatant is harvested and passed
through a resin matrix (His-Bind resin; Novagen, Madison,
Wis.) using a suitable buffer system (His-bind buffer kit;
Novagen). Up to 20 mg of tagged protein can be purified on
a single 2.5 ml column. The IL-15 is eluted from the column
with the elution buffer provided in the kit. It is estimated that
BHK-21 cells growing in 50 ml medium synthesize about 25
mg total protein, with up to 5% comprising a recombinantly
expressed and secreted protein. This corresponds to approximately
1.25 mg of cIL-15.

EXAMPLE 4

Use of Avian IL-15 in Vaccines

The following experiments are performed to evaluate the
immune-enhancing activity of chicken IL-15 in chicken
vaccines.

Chicken IL-15 cDNA is inserted into two viral vectors
(derived from turkey herpesvirus and fowlpox virus,
respectively) that are used for the expression of recombinant
proteins in chickens (Morgan et al., Avian Diseases, 36:858,
1992; Yanagida et al., J. Virol., 66:1402, 1992; Nazerian et
al., J. Virol., 66:1409, 1992). These IL-15-modified live viral
vectors are administered to newly hatched chicks simultaneously
with the administration of various vaccines currently
available. Six days later the chicks are challenged
with the corresponding virulent viruses and observed for 8
weeks for the development of disease. The incidence of
disease in these chicks is compared with controls that do not
receive the IL-15-modified live viral vectors. A sample
protocol (including expected results) is shown in Table 2.

TABLE 2

          Treatment                        Challenge          % expected
Group #   on day 1                         at day 6           with disease

1         none                             none               0
2         none                             virulent Marek's   >80%
3         HVT (not modified)               virulent Marek's   20%
4         HVT-IL-15*                       virulent Marek's   0 to 10%
5         HVT (not modified) + HVT-IL-15   virulent Marek's   0 to 10%
6         none                             virulent NDV       >80%
7         HVT-IL-15                        virulent NDV       30% to >50%
8         NDV vaccine                      virulent NDV       20%
9         NDV vaccine + HVT-IL-15          virulent NDV       0 to 10%

*herpesvirus of turkeys expressing IL-15

In an alternative procedure, newly hatched chicks are
injected intramuscularly with 100 μg of a plasmid containing
cDNA for chicken IL-15, using the methods described in
Ulmer, J. B., Science, 259:1745-1749, 1993. These chicks,
and control chicks receiving a control vector lacking IL-15
cDNA, are vaccinated on day 2 with chicken vaccines and
then challenged on day 7 with the corresponding virulent
viruses. They are observed for 8 weeks for signs of disease.
It is expected that chicks injected with the pcDNA1 vector
containing IL-15 cDNA will exhibit a reduced incidence of
disease relative to controls.

Finally, IL-15 protein purified by the procedure described
in Example 3 is administered intramuscularly to chicks at
hatching, followed by a single daily administration on each
of the following four days. Chicks are divided into three
groups, receiving 0.01, 0.1 or 1.0 μg per injection per day.
A fourth group receives placebo injections. At hatching all
chicks are vaccinated with chicken vaccines and then challenged
on day 7 with the corresponding virulent viruses.
They are then observed for 8 weeks for signs of disease. It
is expected that chicks injected with IL-15 will exhibit a
reduced incidence of disease relative to controls.

EXAMPLE 5

Use of Avian IL-15 in Growth Promotion

Mammalian IL-15 stimulates muscle growth (Quinn, L.
S., Endocrin., 136:3669, 1995) and semi-pure chicken IL-2
stimulates chicken body weight and increases feed conversion
(U.S. Pat. No. 5,028,421). To evaluate the growth-promoting
activity of avian IL-15, the methods described in
Example 4 above may be used to administer IL-15 cDNA in
a viral or plasmid vector, or recombinant IL-15 protein.
Experimental and control chicks are monitored for weight gain and
feed conversion for a period of six weeks. It is expected that
one or more of these protocols will enhance chicken growth
over controls.



SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 2

<210> SEQ ID NO 1
<211> LENGTH: 747
<212> TYPE: DNA
<213> ORGANISM: Gallus domesticus
<400> SEQUENCE: 1

cagataactg ggacactgcc atgatgtgca aagtactgat ctttggctgt atttcggtag 60 

caacgctaat gactacagct tatggagcat ctctatcatc agcaaaaagg aaacctcttc 120 

aaacattaat aaaggattta gaaatattgg aaaatatcaa gaacaagatt catctcgagc 180 

tctacacacc aactgagacc caggagtgca cccagcaaac tctgcagtgt tacctgggag 240 

aagtggttac tctgaagaaa gaaactgaag atgacactga aattaaagaa gaatt-bgtaa 300 

ctgctattca aaatatcgaa aagaacctca agagtcttac gggtctaaat cacaccggaa 360

gtgaatgcaa gatctgtgaa gctaacaaca agaaaaaatt tcctgatttt ctccatgaac 420 

tgaccaactt tgtgagatat ctgcaaaaat aagcaactaa tcatttttat tttactgcta 480 

tgttatttat ttaattattt aattacagat aatttatata ttttatcccg tggctaacta 540 

atctgctgtc cattctggga ccactgtatg ctcttagtct gggtgatatg acgtctgttc 600 

taagatcata tttgatcctt tctgtaacct acgggctcaa aatgtacgtt ggaaaactga 660 

ttgattctca ctttgtcggt aaagtgatat gtgtttactg aaagaatttt taaaagtcac 720 

ttctagatga catttaataa atttcag 747

<210> SEQ ID NO 2
<211> LENGTH: 143
<212> TYPE: PRT
<213> ORGANISM: Gallus domesticus
<400> SEQUENCE: 2

Met Met Cys Lys Val Leu Ile Phe Gly Cys Ile Ser Val Ala Thr Leu
1               5                   10                  15

Met Thr Thr Ala Tyr Gly Ala Ser Leu Ser Ser Ala Lys Arg Lys Pro
            20                  25                  30

Leu Gln Thr Leu Ile Lys Asp Leu Glu Ile Leu Glu Asn Ile Lys Asn
        35                  40                  45

Lys Ile His Leu Glu Leu Tyr Thr Pro Thr Glu Thr Gln Glu Cys Thr
    50                  55                  60

Gln Gln Thr Leu Gln Cys Tyr Leu Gly Glu Val Val Thr Leu Lys Lys
65                  70                  75                  80

Glu Thr Glu Asp Asp Thr Glu Ile Lys Glu Glu Phe Val Thr Ala Ile
                85                  90                  95

Gln Asn Ile Glu Lys Asn Leu Lys Ser Leu Thr Gly Leu Asn His Thr
            100                 105                 110

Gly Ser Glu Cys Lys Ile Cys Glu Ala Asn Asn Lys Lys Lys Phe Pro
        115                 120                 125

Asp Phe Leu His Glu Leu Thr Asn Phe Val Arg Tyr Leu Gln Lys
    130                 135                 140
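
The relationship between the two listings, and the nucleotide range 87-449 recited in the claims, can be illustrated with a small sketch. It assumes the open reading frame begins at nucleotide 21 (the text states that the first ATG follows 20 nt of 5' sequence) and uses the standard genetic code; the function names are hypothetical, and only the first 60 nt of SEQ ID NO:1 are translated.

```python
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ORF_START = 21  # assumed 1-based position of the initiating ATG in SEQ ID NO:1


def residue_to_nt(residue, orf_start=ORF_START):
    """Return the 1-based (first, last) nucleotide positions of a residue's codon."""
    first = orf_start + 3 * (residue - 1)
    return first, first + 2


def translate(dna, start=1):
    """Translate from 1-based position `start` until a stop codon or the
    last complete codon, ignoring spaces in the input."""
    dna = dna.lower().replace(" ", "")
    protein = []
    for pos in range(start - 1, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line (nt 1-60) of SEQ ID NO:1 as printed in the listing.
FIRST_60_NT = "cagataactg ggacactgcc atgatgtgca aagtactgat ctttggctgt atttcggtag"

if __name__ == "__main__":
    print(translate(FIRST_60_NT, start=ORF_START))
    print(residue_to_nt(23)[0], residue_to_nt(143)[1])
```

Under these assumptions the translated fragment matches the first 13 residues of SEQ ID NO:2, and residues 23-143 (the predicted mature protein) map to nucleotides 87-449.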



What is claimed is: 

1. An isolated nucleic acid which:

(a) comprises a nucleic acid sequence having at least 70%
sequence homology, determined by a BLAST
algorithm, to the sequence set forth in nucleotides
87-449 of SEQ ID NO:1; and

(b) encodes a polypeptide capable of stimulating thymidine
incorporation in mitogen activated avian T-cells.

2. The complement of a nucleic acid according to claim 1. 

3. An isolated nucleic acid according to claim 1 or 2, 
which nucleic acid is an avian nucleic acid isolated from 
chicken. 

4. A vector construct comprising the nucleic acid of claim 1.

5. The vector construct according to claim 4, in which said
nucleic acid is operatively associated with a promoter element
capable of expressing the nucleic acid in a host cell.

6. The vector construct according to claim 4, in which the
construct is a recombinant virus.

7. The vector construct according to claim 6, in which the
recombinant virus is a turkey herpes virus or a fowl pox
virus.

8. An isolated nucleic acid which: 

(a) hybridizes to the full length of a nucleic acid having
the complementary sequence of nucleotides 87-449 in
SEQ ID NO:1 under conditions comprising (i) hybridization
in 6xSSC and 0.5% SDS, and (ii) washing at
68° C. in 0.1xSSC and 0.5% SDS; and

(b) encodes a polypeptide capable of stimulating thymidine
incorporation in mitogen activated avian T-cells.

9. The complement of a nucleic acid according to claim 8.

10. An isolated nucleic acid according to claim 8 or 9,
which nucleic acid is an avian nucleic acid isolated from
chicken.

11. An isolated nucleic acid which:

(a) hybridizes to the full length of a nucleic acid having
the complementary sequence of nucleotides 87-449 in
SEQ ID NO:1 under conditions comprising (i) hybridization
in 6xSSC and 0.5% SDS, and (ii) washing at
room temperature in 2xSSC and 0.5% SDS; and

(b) encodes a polypeptide capable of stimulating thymi- 
dine incorporation in mitogen activated avian T-cells. 



12. The complement of a nucleic acid according to claim 
11. 

13. An isolated nucleic acid according to claim 11 or 12, 
which nucleic acid is an avian nucleic acid isolated from 
chicken. 

14. A vector construct comprising the nucleic acid of 
claim 8 or 11. 

15. The vector construct according to claim 14, in which
said nucleic acid is operatively associated with a promoter
element capable of expressing the nucleic acid in a host cell.

16. The vector construct according to claim 14, in which
the construct is a recombinant virus.

17. The vector construct according to claim 16, in which 
the recombinant virus is a turkey herpes virus or a fowl pox 
virus. 

18. An isolated nucleic acid having an open reading frame
that encodes a polypeptide comprising the sequence of
amino acid residues 23-143 set forth in SEQ ID NO:2 (FIG.
2).

19. The complement of a nucleic acid according to claim
18.

20. An isolated nucleic acid according to claim 18 or 19,
which nucleic acid is an avian nucleic acid isolated from
chicken.

21. An isolated nucleic acid according to claim 18, 
wherein the polypeptide comprises the amino acid sequence 
set forth in SEQ ID NO:2 (FIG. 2). 

22. The complement of a nucleic acid according to claim 
21. 

23. An isolated nucleic acid according to claim 21 or 22 
which nucleic acid is an avian nucleic acid isolated from 
chicken. 

24. A vector construct comprising the nucleic acid of 
claim 18 or 21. 

25. The vector construct according to claim 24, in which 
said nucleic acid is operatively associated with a promoter 
element capable of expressing the nucleic acid in a host cell. 

26. The vector construct according to claim 24, in which 
the construct is a recombinant virus. 

27. The vector construct according to claim 26, in which 
the recombinant virus is a turkey herpes virus or a fowl pox 
virus. 

* * * * *

ABSTRACT

The invention provides a novel protein, PX3.101, which can
be isolated from honey bee venom, antibodies against the
polypeptide and nucleic acids encoding PX3.101 and fragments
thereof. The invention also provides pharmaceutical
compositions based upon PX3.101 polypeptide and methods
for using same in the treatment of various diseases, including
various inflammatory diseases such as rheumatoid arthritis.

24 Claims, 9 Drawing Sheets

BEE VENOM PROTEIN AND GENE
ENCODING SAME

CROSS REFERENCES TO RELATED
APPLICATIONS

This application claims the benefit of U.S. Provisional
Application No. 60/100,172, filed Sep. 14, 1998, which is
incorporated herein by reference in its entirety for all purposes.

FIELD OF THE INVENTION

This invention relates to the field of cloning and expression
of a protein with therapeutic value in the treatment of
various diseases, especially inflammatory diseases such as
rheumatoid arthritis. More specifically, the invention relates
to a novel protein called PX3.101 purified from honey bee
venom and the gene encoding the protein.

BACKGROUND OF THE INVENTION

The immune system plays a critical beneficial role in
combating infections. However, in some instances improper
immune responses can result in many disabling diseases.
Autoimmune or immune-system mediated diseases may be
either B-cell mediated (i.e., antibody-mediated) or T-cell
mediated. Many autoimmune diseases involve an undesirable
inflammatory response. Examples of such diseases
include rheumatoid arthritis, chronic hepatitis, Crohn's
disease, psoriasis, vasculitis, and the like.

Existing therapies for autoimmune diseases, particularly
those involving an undesirable inflammatory response, are
inadequate. Most immune system-mediated diseases are
chronic conditions that require the prolonged administration
of drugs to address the symptoms of the disease.
Accordingly, an important criterion for drugs used to treat
these diseases is low toxicity. However, many drugs utilized
to treat autoimmune diseases (e.g., steroids and non-steroidal
anti-inflammatory compounds (NSAIDs)) have
significant toxic side effects that become manifest after
prolonged periods of use. Various immunosuppressive drugs
(e.g., cyclosporin A and azathioprine) have also been used to
treat autoimmune diseases. However, these compounds are
relatively non-specific and have the adverse effect of weakening
the entire immune system, thus leaving the patient
susceptible to infectious disease.

A variety of inflammatory diseases, including rheumatoid
arthritis, are associated with interleukin 8 (IL-8). IL-8 is a
chemokine that promotes the recruitment and activation of
neutrophil leukocytes and represents one of several endogenous
mediators of acute inflammatory response. IL-8 has
also been variously referred to as neutrophil-activating
factor, monocyte-derived neutrophil chemotactic factor,
interleukin-8 (IL-8) and neutrophil-activating peptide. The
term IL-8 has gained the most widespread acceptance and is
used herein.

Inflammation and autoimmune responses commence with
the migration of leukocytes out of the microvasculature into the
extravascular space in response to chemoattractant molecules.
Chemoattractants may originate from the host and
include chemokines and activated complement components,
or may be released from an invading organism. Once
exposed to chemoattractants within the vasculature, the
leukocytes become activated and capable of adhering to the
endothelium, providing the first step in the development of
inflammation. Stimulated neutrophils adhere to the endothelium
of the microvasculature in response to a gradient of
chemoattractants which direct the cells into the extravascular
space toward the source of the chemoattractant. See, for
example, Anderson et al., Journal Clin. Invest. 74:536-551,
(1984); Ley, K. et al., Blood 77:2553-2555, (1991); Paulson,
J. C., "Selectin/carbohydrate-mediated adhesion of
leukocytes", Adhesion: Its Role in Inflammatory Disease, W.
H. Freeman, 1992; Lasky, L. A., "The homing receptor
(LECAM 1/L-selectin)", Adhesion: Its Role in Inflammatory
Disease, W. H. Freeman, (1992).

Rheumatoid arthritis is one of the more prevalent autoimmune
and inflammatory diseases. The disease afflicts
approximately 1% of the total population and about 2.5
million persons in the United States alone. Direct prescription
usage is estimated at $5.6 billion worldwide. For
individuals suffering from rheumatoid arthritis, the individual's
immune system mistakenly perceives the body's own
joint tissue as foreign and thus initiates an abnormal immune
response. The disease is characterized by chronic
inflammation, destruction of cartilage, and ultimately bone
erosion and the destruction of joints.

As with other inflammatory diseases, known treatments
for IL-8 mediated diseases and rheumatoid arthritis can
include the use of nonspecific immunosuppressive drugs that
suppress the entire immune system; as noted above,
however, such treatments put the patient at risk for contracting
an infectious disease. Prolonged use of such drugs can
also result in severe side effects. Moreover, immunosuppressive
drugs are only partially effective in mitigating the
symptoms of rheumatoid arthritis and the utility of the
treatment tends to decrease with time.

Other therapies currently used are non-steroid anti-inflammatory
drugs (NSAIDs), corticosteroids and a variety
of disease modifying anti-rheumatic drugs (DMARDs).
There is general dissatisfaction with these drugs for two
major reasons: (i) incidence of adverse side effects, which
lead to over 700,000 hospitalizations every year, and (ii)
inability to reverse disease progression.

Given the paucity of effective treatments for inflammatory 
diseases and autoimmune diseases generally, and the need 
for effective compositions for treating diseases associated 
with IL-8 such as rheumatoid arthritis more particularly, 
there is a significant need for new substances that can be 
used in the treatment of these diseases. 

The present invention provides novel isolated proteins
and nucleic acids encoding the proteins that are effective in
treating autoimmune and inflammatory diseases, especially
rheumatoid arthritis. The peptides of the invention also can
inhibit the binding of IL-8 to its receptor and inhibit a variety
of enzymes associated with inflammatory diseases.

SUMMARY OF THE INVENTION 

The invention provides nucleic acid molecules that
include a polynucleotide sequence that encodes a PX3.101
polypeptide or fragments thereof. The polypeptides of the
invention have an amino acid sequence at least 75% identical
to an amino acid sequence as set forth in SEQ ID NO:2
over a region at least about 40 amino acids in length when
compared using the BLASTP algorithm with a wordlength
(W) of 3, and the BLOSUM62 scoring matrix. The polynucleotide
sequences are preferably at least 75% identical to
a nucleic acid sequence set forth in residues 74 to 349 of
SEQ ID NO:1 over a region of at least 50 nucleotides in
length when compared using the BLASTN algorithm with a
wordlength (W) of 11, M=5, and N=-4.
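
The identity criterion in the paragraph above (at least 75% identity over a region of at least about 40 amino acids) can be illustrated as follows. This is only a sketch: it assumes the two sequences have already been aligned (for example, from a BLASTP report) and does not reproduce BLAST scoring, the wordlength parameter, or the BLOSUM62 matrix. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical residues.
    Inputs must be pre-aligned strings of equal length; '-' marks a gap
    and a gap column never counts as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_criterion(aligned_a, aligned_b,
                             min_identity=75.0, min_length=40):
    """True if the aligned region is long enough and identical enough."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    reference = "ACDEFGHIKL" * 4            # toy 40-residue region
    close = "W" * 8 + reference[8:]         # 8 substitutions
    distant = "W" * 12 + reference[12:]     # 12 substitutions
    print(percent_identity(reference, close),
          meets_identity_criterion(reference, close))
    print(percent_identity(reference, distant),
          meets_identity_criterion(reference, distant))
```

In the toy example the first variant is 80% identical over 40 residues and passes, while the second is 70% identical and does not.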

Nucleic acids of the invention also include isolated
nucleic acid molecules comprising a nucleotide sequence
selected from the group consisting of: (a) a deoxyribonucleotide
sequence complementary to nucleotides 74 to 349 of
SEQ ID NO:1; (b) a ribonucleotide sequence complementary
to nucleotides 74 to 349 of SEQ ID NO:1; (c) a
nucleotide sequence complementary to the deoxyribonucleotide
sequence of (a) or to the ribonucleotide sequence of
(b); (d) a nucleotide sequence of at least 23 consecutive
nucleotides capable of hybridizing to nucleotides 74 to 349
of SEQ ID NO:1; and (e) a nucleotide sequence capable of
hybridizing to a nucleotide sequence of (d). The nucleic acid
molecules of the invention will generally hybridize to a
polynucleotide sequence consisting of nucleotides 74 to 349
of SEQ ID NO:1 under stringent conditions. An exemplary
nucleic acid of the invention is a nucleic acid consisting of
nucleotides 74 to 349 of SEQ ID NO:1. Nucleic acids of the
invention also include those which are capable of being
amplified with forward primer 5' AAGGATCCACAGTGCAACGTAAGrrC
3' (SEQ ID NO:3) and reverse primer 5'
ACTGAIAAAATAATAAC 3' (SEQ ID NO:5).

The invention also provides polypeptides that have an
amino acid sequence at least 75% identical to an amino acid
sequence as set forth in SEQ ID NO:2 over a region at least
40 amino acids in length when compared using the BLASTP
algorithm with a wordlength (W) of 3, and the BLOSUM62
scoring matrix. Polypeptides of the invention include
polypeptides encoded by a nucleic acid segment that hybridizes
under stringent conditions to a nucleic acid fragment
having the sequence set forth in SEQ ID NO:1. Polypeptides
that are also included are those having an antigenic determinant
common to a polypeptide comprising the amino acid
sequence set forth in SEQ ID NO:2. An example of a
polypeptide of the invention is a polypeptide having the
sequence set forth in SEQ ID NO:2. The invention further
provides polypeptide fragments that include at least 12
contiguous amino acids from SEQ ID NO:2. Other polypeptides
provided by the invention are purified polypeptides
which include a signal peptide, at least 3 GGX repeats, and
a C-terminal segment extending from the last GGX repeat to
the C-terminus which contains at least 7 cysteine residues,
wherein X is any amino acid and the polypeptide is less than
140 amino acids in length.
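
The structural criteria in the last sentence above (a signal peptide, at least 3 GGX repeats, at least 7 cysteines between the last repeat and the C-terminus, and fewer than 140 residues) lend themselves to a simple check. The sketch below is illustrative only: the function name is hypothetical, the signal peptide length is supplied by the caller rather than predicted, repeats are counted as non-overlapping GGX triplets, and the test sequences are invented and are not PX3.101.

```python
import re


def matches_ggx_architecture(seq, signal_length,
                             min_repeats=3, min_cysteines=7, max_length=140):
    """Check a one-letter protein sequence against the stated criteria.
    `signal_length` is the caller's assumed signal peptide length."""
    if signal_length <= 0 or len(seq) >= max_length:
        return False
    after_signal = seq[signal_length:]
    # Non-overlapping Gly-Gly-Xaa triplets, where Xaa is any residue.
    repeats = list(re.finditer(r"GG[A-Z]", after_signal))
    if len(repeats) < min_repeats:
        return False
    c_terminal_segment = after_signal[repeats[-1].end():]
    return c_terminal_segment.count("C") >= min_cysteines


if __name__ == "__main__":
    toy_signal = "MKLLLLAVLA"                    # invented 10-residue leader
    toy_ok = toy_signal + "GGAGGSGGT" + "ACDCECFCHCKCLC"
    toy_few_cys = toy_signal + "GGAGGSGGT" + "ACDCEC"
    print(matches_ggx_architecture(toy_ok, len(toy_signal)))
    print(matches_ggx_architecture(toy_few_cys, len(toy_signal)))
```

The first toy sequence has three repeats followed by seven cysteines and satisfies the check; the second has only three cysteines after the last repeat and does not.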

The invention also includes cells that include a vector 
containing a nucleic acid of the invention. For example, the 
invention provides cells that have a recombinant expression 
cassette containing a promoter operably linked to a poly- 
nucleotide sequence which encodes a polypeptide as 
described herein. Both prokaryotic and eukaryotic cells that 
express polypeptides of the invention are provided. 

Methods for producing a polypeptide comprising the
amino acid sequence of SEQ ID NO:2 or fragments thereof
are also provided. The methods generally include culturing
a host cell containing a recombinant expression cassette
under conditions suitable for the expression of the polypeptide
and then recovering the polypeptide from the host cell
culture.

The invention further provides antibodies that are specific 
for the polypeptides and polypeptide fragments of the inven- 
tion. 

A variety of pharmaceutical compositions are provided by
the invention. These compositions typically contain a
polypeptide as described herein and a pharmaceutically
acceptable excipient. In some instances, the compositions
also include a complementary agent which is known to be
effective in treating inflammatory diseases. Various compositions
can be used to treat various diseases, including, for
example, inflammatory diseases, cancer, autoimmune
diseases, pain, and diseases associated with chemokine
imbalances. These methods generally involve administering
a therapeutically effective dose of one of the pharmaceutical
compositions of the invention to a patient suffering from a
disease.

The invention further provides methods for inhibiting the
interaction between certain chemokines and their receptors
or for inhibiting enzymes associated with inflammatory
diseases. These methods generally involve admixing a
polypeptide of the invention with a solution containing a
chemokine and its receptor or a solution containing an
enzyme associated with inflammatory diseases.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an elution profile for a honey bee venom
suspension eluted through a Sephadex G-50 sizing column.
One ml of honeybee venom suspension (approximately 0.5 g
solid material) was diluted in 10 ml of column buffer
(ammonium formate buffer, 0.1 M, pH 4.6), spun down and
filtered through a 0.45 μm filter. The resulting solution was
loaded onto a Sephadex G-50 column (two connected
columns, each 1.5 x 170 cm (diameter x height)) pre-equilibrated
with the column buffer. The column was eluted
at about 0.6 ml/min, and fractions of 100 drops
(approximately 4.0 ml) were collected.

FIG. 2 shows elution profiles for the three HPLC purifi- 
cation steps used to purify PX3.101 protein. 

FIG. 2A is an HPLC elution profile for fractions obtained
from the G-50 sizing column. Fractions from the G-50 sizing
columns such as shown in FIG. 1 were tested for the
presence of PX3.101 by SDS-PAGE. Positive fractions were
pooled and loaded onto a Reverse Phase (RP) HPLC column
(semi-prep C-18 column). The column was eluted with an
acetonitrile gradient (see Reverse Phase HPLC section in
Example I for detailed information). PX3.101-containing
fractions were collected and freeze-dried.

FIG. 2B is an elution profile showing further purification
of PX3.101 by ion exchange HPLC in which PX3.101
powder from the first RP-HPLC chromatography purification
step was dissolved in 0.1 M ammonium formate (pH
5.8) and loaded onto an ion exchange HPLC column (see
Example 1 for details).

FIG. 2C is an example of the elution profile for the final
purification step of PX3.101 using a second RP-HPLC
column. PX3.101 fractions from the ion exchange HPLC
purification shown in FIG. 2B were pooled and loaded onto
another RP-HPLC column. Shown is the chromatography of
the purification of PX3.101 from 2 g dry honeybee venom.
PX3.101 fractions (Puri-#1, Puri-#2 and Puri-#3) are indicated
on the profile. The mixture of Puri-#1, Puri-#2 and
Puri-#3 was used for animal studies and mechanism studies. The
differences between PX3.101 fractions Puri-#1, Puri-#2 and
Puri-#3 are discussed in the text.

FIG. 3A shows the full-length nucleotide sequence for the
cDNA encoding full-length PX3.101 (SEQ ID NO:1) and
the predicted protein sequence of PX3.101 (SEQ ID NO:2).
The nucleotide sequence is numbered on the left in
plain type; the sequence begins with the first nucleotide of
the PX3.101 cDNA. Amino acids are numbered in italics on
the right. The in-frame stop codon is denoted by an asterisk.
Four peptide sequences obtained from peptide sequencing
are underlined.

FIG. 3B is a schematic representation of the PX3.101
protein structure showing the signal peptide region, the
region containing Gly Gly Xaa (where Xaa=any amino acid
and Gly=Glycine) repeats and the cysteine rich region.

FIG. 3C is a schematic representation of the structures of
PX3.101 protein and its potential homologues. The number
of the amino acids and accession numbers are included.

FIG. 4 is a chart showing the effectiveness of PX3.101 in
controlling inflammation in the CIA (collagen-induced
arthritis) mouse animal model. Indomethacin and bee venom
serve as positive controls. PBS serves as the negative
control. Severity is a measure of the degree of inflammation
measured in each treatment group (see Example V for
details).

FIG. 5 includes photographs of representative joint tissues 
from mice in different treatment groups. 

FIG. 5A (normal control) shows a joint of a mouse that
was not injected with collagen to induce rheumatoid arthritis 
but which was injected with phosphate buffer in normal 
saline (PBS). 

FIG. 5B (negative control) is a photograph of the joint of
a mouse which was injected with collagen to induce rheumatoid
arthritis and also injected with PBS.

FIG. 5C is a photograph of the joint of a mouse which was
injected with collagen to induce rheumatoid arthritis and
also injected with a solution containing PX3.101 (200
μg/kg). Additional details are provided in Example V.

FIG. 6 is a chart showing the effectiveness of various
concentrations of PX3.101 in controlling inflammation in
the CIA (collagen-induced arthritis) mouse model. Three
doses of PX3.101 (8 μg/kg, 40 μg/kg, and 200 μg/kg) were
tested. Severity is a measure of the degree of inflammation
measured in each treatment group (see Example V for
additional details).

FIG. 7A is a chart showing inhibition of binding between
the chemokine IL-8 and its receptor CXCR2 at different
concentrations of PX3.101 purified from honeybee venom.
Purified PX3.101 was added to 0.2 ml reaction solution to
give final PX3.101 concentrations of 0, 0.01, 0.1, 1.0 and 10
μM (Columns 1 through 5, respectively). Reaction mixtures
also contained 0.15 mg/ml membrane preparation of human
recombinant CHO cells expressing CXCR2, 0.015 nM ¹²⁵I-labeled
IL-8, and 10 nM unlabeled IL-8, and were incubated for 60
minutes at room temperature. Bound radioligand was separated
from unbound radioligand and the radioactivity
counted on a gamma counter. The ligand bound was normalized
and calculated as a percentage of the ligand bound
in solutions without PX3.101. The effects of the purified
PX3.101 on the binding of IL-8 to the receptor CXCR1
(Column 6: 0 μM; and Column 7: 10 μM) and of TNF-α
(tumor necrosis factor-α) to TNF-α receptor are also shown
(Column 8: 0 μM; and Column 9: 10 μM). Details concerning
the assay methods are described in Example VI. The results
represent the averages of two measurements.

FIG. 7B shows inhibition plots demonstrating inhibition 
of IL-8/CXCR2 interaction by recombinant PX3.101 protein 
(•) and native PX3.101 protein from honeybee venom (A). 

FIG. 8 is a schematic of the expression vector constructed 
to produce recombinant PX3.101 protein. 

DEFINITIONS

The term "nucleic acid" refers to a deoxyribonucleotide or
ribonucleotide polymer in either single- or double-stranded
form, and unless otherwise limited, encompasses known
analogues of natural nucleotides that hybridize to nucleic
acids in a manner similar to naturally-occurring nucleotides.
Unless otherwise indicated, a particular nucleic acid
sequence includes the complementary sequence thereof. A
"subsequence" refers to a sequence of nucleotides or amino
acids that comprises a part of a longer sequence of nucleotides
or amino acids (e.g., a polypeptide), respectively.

A "probe" is a nucleic acid capable of binding to a target
nucleic acid of complementary sequence through one or
more types of chemical bonds, usually through complementary
base pairing, usually through hydrogen bond formation,
thus forming a duplex structure. The probe binds or hybridizes
to a "probe binding site." A probe may include natural
(i.e., A, G, C, or T) or modified bases (7-deazaguanosine,
inosine, etc.). A probe can be an oligonucleotide which is a
single-stranded DNA. Oligonucleotide probes can be synthesized
or produced from naturally occurring polynucleotides.
In addition, the bases in a probe can be joined by a
linkage other than a phosphodiester bond, so long as it does
not interfere with hybridization. Thus, probes may be peptide
nucleic acids in which the constituent bases are joined
by peptide bonds rather than phosphodiester linkages (see,
for example, Nielsen et al., Science 254, 1497-1500
(1991)). Some probes may have leading and/or trailing
sequences of noncomplementarity flanking a region of
complementarity.

The terms "polypeptide," "peptide" and "protein" are
used interchangeably to refer to a polymer of amino acid
residues. The terms also apply to amino acid polymers in
which one or more amino acids are chemical analogues of a
corresponding naturally-occurring amino acid.

The term "operably linked" refers to functional linkage
between a nucleic acid expression control sequence (such as 
a promoter, signal sequence, or array of transcription factor 
binding sites) and a second polynucleotide, wherein the 
expression control sequence affects transcription and/or 
translation of the second polynucleotide. 

A "heterologous sequence" or a "heterologous nucleic 
acid," as used herein, is one that originates from a source 
foreign to the particular host cell, or, if from the same 
source, is modified from its original form. Thus, a heterolo- 
gous gene in a prokaryotic host cell includes a gene that, 
although being endogenous to the particular host cell, has 
been modified. Modification of the heterologous sequence 
can occur, e.g., by treating the DNA with a restriction 
enzyme to generate a DNA fragment that is capable of being 
operably linked to the promoter. Techniques such as site- 
directed mutagenesis are also useful for modifying a heter- 
ologous nucleic acid. 

The term "recombinant" when used with reference to a 
cell indicates that the cell replicates a heterologous nucleic
acid, or expresses a peptide or protein encoded by a heter- 
ologous nucleic acid. Recombinant cells can contain genes 
that are not found within the native (non-recombinant) form 
of the cell. Recombinant cells can also contain genes found 
in the native form of the cell wherein the genes are modified 
and re-introduced into the cell by artificial means. The term 
also encompasses cells that contain a nucleic acid endog- 
enous to the cell that has been modified without removing 
the nucleic acid from the cell; such modifications include
those obtained by gene replacement, site-specific mutation, 
and related techniques. 

A "recombinant expression cassette" or simply an 
"expression cassette" is a nucleic acid construct, generated 
recombinantly or synthetically, that has control elements 
that are capable of effecting expression of a structural gene 
that is operably linked to the control elements in hosts 
compatible with such sequences. Expression cassettes 
include at least promoters and optionally, transcription ter- 
mination signals. Typically, the recombinant expression 
cassette includes at least a nucleic acid to be transcribed 




(e.g., a nucleic acid encoding a desired polypeptide) and a
promoter. Additional factors necessary or helpful in effecting
expression can also be used as described herein. For
example, an expression cassette can also include nucleotide
sequences that encode a signal sequence that directs secretion
of an expressed protein from the host cell. Transcription
termination signals, enhancers, and other nucleic acid
sequences that influence gene expression, can also be
included in an expression cassette.

The term "isolated," "purified" or "substantially pure"
means an object species (e.g., PX3.101 polypeptide or
fragments thereof, or a nucleic acid fragment) is the predominant
macromolecular species present (i.e., on a molar
basis it is more abundant than any other individual species
in the composition), and preferably the object species comprises
at least about 50 percent (on a molar basis) of all
macromolecular species present. Generally, an isolated,
purified or substantially pure composition will comprise
more than 80 to 90 percent of all macromolecular species
present in a composition. Most preferably, the object species
is purified to essential homogeneity (i.e., contaminant species
cannot be detected in the composition by conventional
detection methods) wherein the composition consists essentially
of a single macromolecular species.

The term "complementary" means that one nucleic acid is
identical to, or hybridizes selectively to, another nucleic acid
molecule. Selectivity of hybridization exists when hybridization
occurs that is more selective than total lack of
specificity. Typically, selective hybridization will occur
when there is at least about 55% identity over a stretch of at
least 14-25 nucleotides, preferably at least 65%, more
preferably at least 75%, and most preferably at least 90%.
Preferably, one nucleic acid hybridizes specifically to the
other nucleic acid. See M. Kanehisa, Nucleic Acids Res.
12:203 (1984).

The terms "identical" or percent "identity," in the context
of two or more nucleic acids or polypeptides, refer to two or
more sequences or subsequences that are the same or have
a specified percentage of nucleotides or amino acid residues
that are the same, when compared and aligned for maximum
correspondence, as measured using a sequence comparison
algorithm such as those described below for example, or by
visual inspection.

The phrase "substantially identical," in the context of two
nucleic acids or polypeptides, refers to two or more
sequences or subsequences that have at least 75%, preferably
at least 85%, more preferably at least 90%, 95% or
higher nucleotide or amino acid residue identity, when
compared and aligned for maximum correspondence, as
measured using a sequence comparison algorithm such as
those described below for example, or by visual inspection.
Preferably, the substantial identity exists over a region of the
sequences that is at least about 40-50 residues in length,
preferably over a longer region than 50 amino acids, more
preferably at least about 90-100 residues, and most preferably
the sequences are substantially identical over the full
length of the sequences being compared, such as the coding
region of a nucleotide for example.

For sequence comparison, typically one sequence acts as
a reference sequence, to which test sequences are compared.
When using a sequence comparison algorithm, test and
reference sequences are input into a computer, subsequence
coordinates are designated, if necessary, and sequence algorithm
program parameters are designated. The sequence
comparison algorithm then calculates the percent sequence
identity for the test sequence(s) relative to the reference
sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith
& Waterman, Adv. Appl. Math. 2:482 (1981), by the homology
alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of
Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444
(1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, Wis.), or by visual
inspection (see generally Ausubel et al., supra).

Another example of an algorithm that is suitable for determining
percent sequence identity and sequence similarity is
the BLAST algorithm, which is described in Altschul et al.,
J. Mol. Biol. 215:403-410 (1990). Software for performing
BLAST analyses is publicly available through the National
Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first
identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence, which either
match or satisfy some positive-valued threshold score T
when aligned with a word of the same length in a database
sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood
word hits act as seeds for initiating searches to find longer
HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative
alignment score can be increased. Cumulative scores
are calculated using, for nucleotide sequences, the parameters
M (reward score for a pair of matching residues;
always >0) and N (penalty score for mismatching residues;
always <0). For amino acid sequences, a scoring matrix is
used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum
achieved value; the cumulative score goes to zero or
below, due to the accumulation of one or more negative-scoring
residue alignments; or the end of either sequence is
reached. For identifying whether a nucleic acid or polypeptide
is within the scope of the invention, the default parameters
of the BLAST programs are suitable. The BLASTN
program (for nucleotide sequences) uses as defaults a word
length (W) of 11, an expectation (E) of 10, M=5, N=-4, and
a comparison of both strands. For amino acid sequences, the
BLASTP program uses as defaults a word length (W) of 3,
an expectation (E) of 10, and the BLOSUM62 scoring
matrix. The TBLASTN program (using protein sequence for
nucleotide sequence) uses as defaults a word length (W) of
3, an expectation (E) of 10, and a BLOSUM62 scoring
matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci.
USA 89:10915 (1989)).

In addition to calculating percent sequence identity, the
BLAST algorithm also performs a statistical analysis of the
similarity between two sequences (see, e.g., Karlin &
Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787
(1993)). One measure of similarity provided by the BLAST
algorithm is the smallest sum probability (P(N)), which
provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would
occur by chance. For example, a nucleic acid is considered
similar to a reference sequence if the smallest sum probability
in a comparison of the test nucleic acid to the
reference nucleic acid is less than about 0.1, more preferably
less than about 0.01, and most preferably less than about
0.001.

Another indication that two nucleic acid sequences are
substantially identical is that the two molecules hybridize to
each other under stringent conditions. "Bind(s) substantially"
refers to complementary hybridization between a
probe nucleic acid and a target nucleic acid and embraces
minor mismatches that can be accommodated by reducing
the stringency of the hybridization media to achieve the
desired detection of the target polynucleotide sequence. The
phrase "hybridizing specifically to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular
nucleotide sequence under stringent conditions when that
sequence is present in a complex mixture (e.g., total cellular)
DNA or RNA.
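
By way of illustration only, the word-hit extension rule described above for the BLAST algorithm (reward M for a match, penalty N for a mismatch, and the three stopping conditions) may be sketched as follows. This is a simplified, ungapped, one-directional sketch and not the BLAST implementation itself; the function name `extend_right` and the default drop-off value `x_drop=20` are assumptions introduced for this example, while M=5 and N=-4 are the BLASTN defaults recited above.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed. Hypothetical helper for illustration only.
    """
    def pair_score(i):
        return match if query[q_start + i] == subject[s_start + i] else mismatch

    # Score the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_len = score, word_len

    i = word_len
    # Stop condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += pair_score(i)
        if score > best_score:
            best_score, best_len = score, i + 1
        # Stop condition 1: score fell off by X from its maximum.
        # Stop condition 2: cumulative score went to zero or below.
        elif best_score - score >= x_drop or score <= 0:
            break
        i += 1
    return best_score, best_len


# Seed "ACGT" at position 0 of both sequences; the next four residues
# match, after which mismatches accumulate until the drop-off is reached.
print(extend_right("ACGTACGTAAAA", "ACGTACGTCCCC", 0, 0, 4, x_drop=10))
```

A full search would also extend to the left and, for amino acid sequences, would take pair scores from a scoring matrix such as BLOSUM62 rather than from fixed M and N values.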

The term "stringent conditions" refers to conditions under
which a probe will hybridize to its target subsequence, but
to no other sequences. Stringent conditions are sequence-dependent
and will be different in different circumstances.
Longer sequences hybridize specifically at higher temperatures.
Generally, stringent conditions are selected to be about
5° C. lower than the thermal melting point (Tm) for the
specific sequence at a defined ionic strength and pH. The Tm
is the temperature (under defined ionic strength, pH, and
nucleic acid concentration) at which 50% of the probes
complementary to the target sequence hybridize to the target
sequence at equilibrium. (As the target sequences are generally
present in excess, at Tm, 50% of the probes are
occupied at equilibrium.) Typically, stringent conditions will
be those in which the salt concentration is less than about 1.0
M Na ion, typically about 0.01 to 1.0 M Na ion concentration
(or other salts) at pH 7.0 to 8.3 and the temperature is
at least about 30° C. for short probes (e.g., 10 to 50
nucleotides) and at least about 60° C. for long probes (e.g.,
greater than 50 nucleotides). Stringent conditions can also
be achieved with the addition of destabilizing agents such as
formamide.
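
As a minimal sketch of the numerical guidance in the preceding paragraph, the helper functions below compute a hybridization temperature about 5° C. below a known Tm and check a set of conditions against the typical salt and temperature ranges recited above. The function names are hypothetical and chosen for this example; the Tm is taken as an input rather than calculated, and pH and destabilizing agents such as formamide are not modeled.

```python
def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about 5 deg C below the thermal melting point (Tm)."""
    return tm_celsius - offset


def meets_typical_stringency(na_molar, temp_celsius, probe_length):
    """Check salt and temperature against the typical stringent ranges.

    Short probes (10 to 50 nucleotides): at least about 30 deg C.
    Long probes (greater than 50 nucleotides): at least about 60 deg C.
    Salt: less than about 1.0 M Na ion.
    """
    min_temp = 30.0 if probe_length <= 50 else 60.0
    return na_molar < 1.0 and temp_celsius >= min_temp


print(stringent_temperature(65.0))               # 5 deg C below a Tm of 65
print(meets_typical_stringency(0.5, 62.0, 200))  # long probe, 62 deg C
print(meets_typical_stringency(0.5, 45.0, 200))  # long probe, too cool
```

Because the recited thresholds are approximate ("about"), the hard cut-offs used here are a simplification.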

A further indication that two nucleic acid sequences or 
polypeptides are substantially identical is that the polypep- 
tide encoded by the first nucleic acid is immunologically 
cross reactive with the polypeptide encoded by the second 
nucleic acid, as described below. The phrases "specifically
binds to a protein" or "specifically immunoreactive with,"
when referring to an antibody, refer to a binding reaction
which is determinative of the presence of the protein in the
presence of a heterogeneous population of proteins and other
biologics. Thus, under designated immunoassay conditions,
a specified antibody binds preferentially to a particular 
protein and does not bind in a significant amount to other 
proteins present in the sample. Specific binding to a protein 
under such conditions requires an antibody that is selected 
for its specificity for a particular protein. A variety of 
immunoassay formats may be used to select antibodies 
specifically immunoreactive with a particular protein. For 
example, solid-phase ELISA immunoassays are routinely
used to select monoclonal antibodies specifically immunoreactive
with a protein. See, e.g., Harlow and Lane (1988)
Antibodies, A Laboratory Manual, Cold Spring Harbor
Publications, New York, for a description of immunoassay
formats and conditions that can be used to determine specific 
immunoreactivity. 

"Conservatively modified variations" of a particular poly- 
nucleotide sequence refers to those polynucleotides that 
encode identical or essentially identical amino acid 
sequences, or where the polynucleotide does not encode an 
amino acid sequence, to essentially identical sequences. 
Because of the degeneracy of the genetic code, a large 
number of functionally identical nucleic acids encode any 
given polypeptide. For instance, the codons CGU, CGC,
CGA, CGG, AGA, and AGG all encode the amino acid
arginine. Thus, at every position where an arginine is
specified by a codon, the codon can be altered to any of the
corresponding codons described without altering the
encoded polypeptide. Such nucleic acid variations are
"silent variations," which are one species of "conservatively
modified variations." Every polynucleotide sequence
described herein which encodes a polypeptide also describes
every possible silent variation, except where otherwise
noted. One of skill will recognize that each codon in a
nucleic acid (except AUG, which is ordinarily the only
codon for methionine) can be modified to yield a functionally
identical molecule by standard techniques. Accordingly,
each "silent variation" of a nucleic acid which encodes a
polypeptide is implicit in each described sequence.
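
The degeneracy described above can be illustrated with a short sketch that builds the standard genetic code and lists the codons that are interchangeable for a given codon. The function names `synonymous_codons` and `count_silent_variants` are hypothetical and introduced only for this example; the sketch assumes the standard genetic code and an RNA coding sequence whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position
# (first base varies slowest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons encoding the same amino acid as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_silent_variants(rna):
    """Count coding sequences (including `rna` itself) that encode the
    same polypeptide, i.e. the product of per-codon synonym counts."""
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


print(synonymous_codons("CGU"))         # the six arginine codons
print(synonymous_codons("AUG"))         # methionine has a single codon
print(count_silent_variants("AUGCGU"))  # Met-Arg
```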

A polypeptide is typically substantially identical to a
second polypeptide, for example, where the two peptides
differ only by conservative substitutions. A "conservative
substitution," when describing a protein, refers to a change
in the amino acid composition of the protein that does not
substantially alter the protein's activity. Thus, "conservatively
modified variations" of a particular amino acid
sequence refers to amino acid substitutions of those amino
acids that are not critical for protein activity or substitution
of amino acids with other amino acids having similar
properties (e.g., acidic, basic, positively or negatively
charged, polar or non-polar, etc.) such that the substitutions
of even critical amino acids do not substantially alter activity.
Conservative substitution tables providing functionally
similar amino acids are well-known in the art. See, e.g.,
Creighton (1984) Proteins, W.H. Freeman and Company. In
addition, individual substitutions, deletions or additions
which alter, add or delete a single amino acid or a small
percentage of amino acids in an encoded sequence are also
"conservatively modified variations."

The term "naturally-occurring" as applied to an object 
refers to the fact that an object can be found in nature. For 
example, a polypeptide or polynucleotide sequence that is 
present in an organism that can be isolated from a source in 
nature and which has not been intentionally modified by 
humans in the laboratory is naturally-occurring. 

The term "antibody" refers to a protein consisting of one 
or more polypeptides substantially encoded by immunoglo- 
bulin genes or fragments of immunoglobulin genes. The 
recognized immunoglobulin genes include the kappa, 
lambda, alpha, gamma, delta, epsilon and mu constant 
region genes, as well as myriad immunoglobulin variable 
region genes. Light chains are classified as either kappa or 
lambda. Heavy chains are classified as gamma, mu, alpha,
delta, or epsilon, which in turn define the immunoglobulin
classes, IgG, IgM, IgA, IgD and IgE, respectively.

A typical immunoglobulin (antibody) structural unit comprises
a tetramer. Each tetramer is composed of two identical
pairs of polypeptide chains, each pair having one "light"
(about 25 kD) and one "heavy" chain (about 50-70 kD). The
N-terminus of each chain defines a variable region of about
100 to 110 or more amino acids primarily responsible for
antigen recognition. The terms variable light chain (VL) and
variable heavy chain (VH) refer to these light and heavy
chains respectively.

Antibodies exist as intact immunoglobulins or as a number
of well-characterized fragments produced by digestion
with various peptidases. Thus, for example, pepsin digests
an antibody below the disulfide linkages in the hinge region
to produce F(ab)'2, a dimer of Fab which itself is a light
chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide
linkage in the hinge region thereby converting the F(ab)'2
dimer into an Fab' monomer. The Fab' monomer is essentially
an Fab with part of the hinge region (see, Fundamental
Immunology, W. E. Paul, ed., Raven Press, N.Y. (1993), for
a more detailed description of other antibody fragments).
While various antibody fragments are defined in terms of the
digestion of an intact antibody, one of skill will appreciate
that such Fab' fragments may be synthesized de novo either
chemically or by utilizing recombinant DNA methodology.
Thus, the term antibody, as used herein also includes antibody
fragments either produced by the modification of
whole antibodies or synthesized de novo using recombinant
DNA methodologies. Preferred antibodies include single
chain antibodies, more preferably single chain Fv (scFv)
antibodies in which a variable heavy and a variable light
chain are joined together (directly or through a peptide
linker) to form a continuous polypeptide.

A single chain Fv ("sFv" or "scFv") polypeptide is a
covalently linked VH::VL heterodimer which may be
expressed from a nucleic acid including VH- and
VL-encoding sequences either joined directly or joined by a
peptide-encoding linker. Huston, et al. Proc. Nat. Acad. Sci.
USA, 85:5879-5883 (1988). A number of structures are known for
converting the naturally aggregated, but chemically separated,
light and heavy polypeptide chains from an antibody V
region into an scFv molecule which will fold into a three
dimensional structure substantially similar to the structure of
an antigen-binding site. See, e.g. U.S. Pat. Nos. 5,091,513
and 5,132,405 and 4,956,778.

An "antigen-binding site" or "binding portion" refers to
the part of an immunoglobulin molecule that participates in
antigen binding. The antigen binding site is formed by
amino acid residues of the N-terminal variable ("V") regions
of the heavy ("H") and light ("L") chains. Three highly
divergent stretches within the V regions of the heavy and
light chains are referred to as "hypervariable regions" which
are interposed between more conserved flanking stretches
known as "framework regions" or "FRs". Thus, the term
"FR" refers to amino acid sequences that are naturally found
between and adjacent to hypervariable regions in immunoglobulins.
In an antibody molecule, the three hypervariable
regions of a light chain and the three hypervariable regions
of a heavy chain are disposed relative to each other in three
dimensional space to form an antigen binding "surface".
This surface mediates recognition and binding of the target
antigen. The three hypervariable regions of each of the
heavy and light chains are referred to as "complementarity
determining regions" or "CDRs" and are characterized, for
example by Kabat et al. Sequences of proteins of immunological
interest, 4th ed. U.S. Dept. Health and Human
Services, Public Health Services, Bethesda, Md. (1987).

The term "antigenic determinant" refers to the particular
chemical group of a molecule that confers antigenic specificity.

The term "epitope" generally refers to that portion of an
antigen that interacts with an antibody. More specifically, the
term epitope includes any protein determinant capable of
specific binding to an immunoglobulin or T-cell receptor.
Specific binding exists when the dissociation constant for
antibody binding to an antigen is ≤1 μM, preferably ≤100
nM and most preferably ≤1 nM. Epitopic determinants
usually consist of chemically active surface groupings of
molecules such as amino acids and typically have specific
three dimensional structural characteristics, as well as specific
charge characteristics.

The term "patient" includes human and veterinary subjects.

DETAILED DESCRIPTION

The present invention provides a novel purified protein that the
present inventors call PX3.101 which is effective in treating
a variety of diseases, especially inflammatory diseases such
as rheumatoid arthritis. Also provided are nucleic acids
which encode PX3.101, as well as expression cassettes,
expression vectors and cells containing same for use in
producing PX3.101 polypeptide and fragments thereof via
recombinant methods. The present invention further provides
antibodies which specifically bind to the proteins.
Pharmaceutical compositions containing the proteins of the
invention and methods for treating various diseases using
such pharmaceutical compositions are also provided. The
invention further provides methods for inhibiting the binding
between chemokines and their receptors and methods for
inhibiting certain enzymes associated with various inflammatory
diseases.

In addition to being useful in treating various diseases
such as inflammatory diseases, the protein provided by the
present invention is useful in studying interactions between
chemokines and receptors therefor and for kinetic and
inhibition studies involving enzymes such as
cyclooxygenases, phospholipases, and proteases that have
been implicated in various inflammatory diseases. The
nucleotide and peptide sequences provided by the present
invention are also useful to generate primers and/or probes to
screen for PX3.101 homologues in different species, particularly
in human.

I. Proteins

[...] present invention provides a
substantially pure PX3.101 polypeptide isolated from natural
sources [...] according to recombinant
methods [...] by chemical synthesis and/or
[...] methods and chemical
synthesis. PX3.101 polypeptide is exemplified by the amino
acid sequence shown in FIG. 3A and SEQ ID NO:2. If
[...] natural sources, PX3.101 is preferably isolated
[...], particularly from honey bee venom. Full-length
PX3.101 has a molecular weight of approximately
[...] characterized by five
Gly-Gly-Xaa repeats (Gly=Glycine and Xaa=any amino
acid; sometimes simply referred to as GGX) at the amino-terminus
and a cysteine-rich motif at the carboxy terminus.
As used herein the term PX3.101 includes the full-length
molecule as set forth in SEQ ID NO:2 and other polypeptides
having a similar activity. The term also includes the
[...] the signal sequence (residues 1-19 of SEQ
ID NO:2). Also included, for example, are polypeptides
having amino acid sequences consisting of residues 22-92
of SEQ ID NO:2, residues 24-92 of SEQ ID NO:2 and
residues 26-92 of SEQ ID NO:2.

The invention also includes an isolated polypeptide having
an amino acid sequence at least about 75% identical to
the amino acid sequence as set forth in SEQ ID NO:2. More
preferably, the polypeptide of the invention is at least
80-85% identical, still more preferably at least 90% or 95%
identical to the amino acid sequence of SEQ ID NO:2. The
region of similarity between PX3.101 and a polypeptide of
interest typically extends over a region of at least 40 amino
acids in length, more preferably over a longer region than 40
amino acids such as 50, 60, 70 or 80 amino acids, and most
preferably over the full length of the polypeptide. One
example of an algorithm that is useful for comparing a
polypeptide to the amino acid sequence of PX3.101 is the
BLASTP algorithm; suitable parameters include a word
length (W) of 3, and a BLOSUM62 scoring matrix.

Besides substantially full-length polypeptides, the present
invention provides for biologically active fragments of the
polypeptides. Biological activity may include effectiveness
of the polypeptide in alleviating the symptoms of various
inflammatory diseases (for example, rheumatoid arthritis),
and/or inhibiting the binding of chemokines (e.g., IL8) with
receptors therefor, and/or inhibiting enzymes associated
with inflammatory diseases (such proteins include, by way
of illustration and not limitation, cyclooxygenases,
phospholipases, lipooxygenases, and proteases such as
trypsin and cathepsin G). Other examples of significant
biological activity include antibody binding (e.g., the fragment
competes with a full-length PX3.101 as set forth in
SEQ ID NO:2) and immunogenicity (i.e., possession of
epitopes that stimulate B- or T-cell responses against the
fragment).

The invention further provides a subsequence which
ordinarily comprises at least 5 contiguous amino acids,
typically at least 6 or 7 contiguous amino acids, more
typically 8 or 9 contiguous amino acids, usually at least 10,
11 or 12 contiguous amino acids, preferably at least 13 or 14
contiguous amino acids, more preferably at least 16 contiguous
amino acids, and most preferably at least 20, 40, 60
or 80 contiguous amino acids. Other examples of subsequences
provided by the invention are amino acid sequences
wherein 1 to 10 amino acids are removed from the
N-terminal end of PX3.101 (i.e., residues 1-10 of SEQ ID
NO:2). Examples of such polypeptides are listed in Table II
below, wherein 2, 4 or 6 amino acids are missing from the
N-terminal end of full-length PX3.101.

Polypeptides of the invention also include particular
regions or domains of the amino acid sequence as set forth
in SEQ ID NO:2. For example, polypeptides of the invention
include the signal region (from residue 1 to 19 of SEQ ID
NO:2), a region containing GGX repeats (from residue 20 to
34 of SEQ ID NO:2; also referred to as the GGX protein or
peptide) and a cysteine rich region at the C-terminus (from
residue 35 to 92 of SEQ ID NO:2) which is characterized by
a specific cysteine pattern CXCXXG (C=Cysteine,
G=Glycine, and X=any amino acid).

[...] multiple cysteine residues. As used herein, a signal peptide
or sequence is a sequence which is capable of mediating the
transport of a polypeptide to the cell surface or exterior of
intracellular membranes. The polypeptide is typically at
least 50, 60, 70, 80 or 90 amino acids long, or any of the
lengths therebetween. The polypeptide generally is no
longer than 150, 140, 130, 120, 110 or 100 amino acids, or
any length therebetween. The segment containing multiple
GGX repeats typically contains at least 3 or 4 repeats, and
in other instances contains 5 repeats, although more repeats
are possible. The number of cysteines in the C-terminal
segment is typically at least 5, but may be 6, 7, 8, 9, 10, 11
or 12. More cysteines than this may also be included within
this segment. In one particular polypeptide, the polypeptide
includes a signal sequence, a segment containing 5 GGX
repeats and a C-terminal segment which includes 10 cysteines.

II. Nucleic Acids

The present invention further provides isolated and/or
recombinant nucleic acids that encode the entire PX3.101
protein (SEQ ID NO:2) or subsequences thereof which have
PX3.101 activity. The nucleic acids of the invention can
include naturally occurring, synthetic, and intentionally
manipulated polynucleotide sequences (e.g., site directed
mutagenesis or use of alternate promoters for RNA
transcription). The polynucleotide sequence for PX3.101
includes antisense sequences. The nucleic acids of the
invention also include sequences that are degenerate as a
result of the degeneracy of the genetic code.

The polynucleotide encoding PX3.101 includes the nucleotide
sequence as set forth in SEQ ID NO:1 and nucleic acid
sequences complementary to that sequence. Also included in
the invention are subsequences of the above-described
nucleic acid sequences. Such subsequences include, for
example, the coding region of SEQ ID NO:1 (nucleotides 74
to 349), as well as subsequences that are at least 17, 18, 19,
20, 21, 22, 23, 24 or 25 nucleotides in length and which
hybridize specifically to a nucleic acid which encodes

The polypeptides of the invention are typically encoded PX3.101. Thus, the invention also includes an isolated 

by nucleotide sequences that are substantially identical with 40 nucleic acid molecule comprising a nucleotide sequence 

the nucleotide sequence set forth in SEQ ID N0:1 and selected from the group consisting of (a) a deoxyribonucle- 

shown in FIG. 3A. The nucleotides encoding the polypep- otide sequence complementary to nucleotides 74 to 349 of 

tides of the invention will also typically hybridize to the SEQ ID N0:1; (b) a ribonucleotide sequence complemen- 

polynucleotide sequence set forth in SEQ ID N0:1. tary to nucleotides 74 to 349 of SEQ ID NO:l; (c) a 

Often the polypeptides of the invention will share at least 45 nucleotide sequence complementary to the deoxyribonucle- 

ohe antigenic determinant in common with the amino acid otide sequence of (a) or to the ribonucleotide sequence of 

sequence set forth in SEQ ID N0:2. 'the existence of such (b); (d) a nucleotide sequence of at least 23 consecutive 

a common determinant is evidenced by cross-reactivity of nucleotides capable of hybridizing to nucleotides 74 to 349 

the variant protein with any antibody prepared against of SEQ ID N0:1; and (e) a nucleotide sequence capable of 

PX3.101 polypeptide. Cross-reactivity may be tested using so hybridizing to a nucleotide sequence of (d). 

polyclonal sera against PX3.101, but can also be tested using The invention further provides nucleic acid molecules that 

one or more monoclonal antibodies against PX3.101. include a polynucleotide sequence that encodes a polypcp- 

The invention further includes the polypeptides described tide having an amino acid sequence that is substantially 

herein in which the polypeptide includes modified polypep- identical to the amino acid sequence set forth in SEQ ID 

tide backbones. Illustrative examples of such modifications 55 N0:2. For example, the invention includes a polynucleotide 

include chemical derivatizations of polypeptides, such as sequence that encodes a polypeptide having an amino 

acetylations and carboxylations. Modifications also include sequence that is at least 75% identical to the amino acid 

glycosylation modifications and processing variants of a sequence as set forth in SEQ ID N0:2 over a region of at 

typical polypeptide. Such processing steps specifically least 40 amino acids in length. More preferably, the polypep- 

include enzymatic modification.s, such as ubiquitini/ation 60 tide encoded by the nucleic acid of the invention are at least 

and phosphorylation. See, e.g., Hershko & Ciechanover, 80 to 85% identical lo the amino acid sequence of SEQ ID 

Ann. Rev. Biochem, 51:335-364 (1982). N0:2, and still more preferably at least 90% or 95% 

The polypeptides provided by the invention also include identical to the amino acid sequence of SEQ ID N0:2 over 

isolated polypeptides comprising a signal peptide, a segment a region of at least 40 amino acids. In some instances, the 

containing multiple GGX repeats (where G is glycine and N 65 region of percent identity extends over a region of 50, 60, 70 

is any amino acid), and a C-terminal segment extending or 80 amino acids, and more preferably over the full length 

from the last GGX repeat to the C-terminus which contains of the amino acid sequence set forth in SEQ ID N0:2. 
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Sequence oomparisoos of the protein encoded by the nucleic 
acids of the invention can be performed visually or with a 
comparison algorithm. One such algorithm is the BLASTP 
algorithm using a wordlength (W) of 3 and the BLOSUM62 
scoring matrix. 

The polynucleotide sequences are typically substantially
identical to a polynucleotide sequence such as residues 74 to
349 of SEQ ID NO:1. For example, the invention includes
polynucleotide sequences that are at least about 75%
identical to the nucleic acid SEQ ID NO:1 over a region of at
least about 50 nucleotides in length. More preferably, the
nucleic acids of the invention are at least 80-85% identical
to the nucleic acid sequence shown in SEQ ID NO:1, and
still more preferably at least 90-95% identical to the nucleic
acid sequence of SEQ ID NO:1 over a region of at least 50
nucleotides. In some instances, the region of percent
identity extends over a longer region than 50 nucleotides,
such as 75, 100, 125, 150, 175, 200, 225 or 250 nucleotides,
or over the full length of the encoding region (residues 74 to
349 of SEQ ID NO:1).
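
As an illustration of "percent identity over a region", the following
minimal Python sketch scores two sequences that are assumed to be
already aligned without gaps and of equal length. The function names
and the toy sequences are hypothetical and are not taken from the
specification; a real comparison would use an alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def best_window_identity(a: str, b: str, window: int = 50) -> float:
    """Highest percent identity found in any ungapped window of the given length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or window > len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


if __name__ == "__main__":
    # Toy sequences: identical over the first 50 positions, different afterwards.
    ref = "A" * 100
    candidate = "A" * 50 + "C" * 50
    print(percent_identity(ref, candidate))          # identity over the full length
    print(best_window_identity(ref, candidate, 50))  # best 50-nucleotide region
```

The second function reflects the idea that a threshold such as 75%
applies to a region of a stated minimum length (here 50 nucleotides)
rather than necessarily to the whole sequence.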

To identify nucleic acids of the invention, one can employ 
a nucleotide sequence comparison algorithm such as are 
known to those of skill in the art. For example, one can use 
the BLASTN algorithm. Suitable parameters for use in 
BLASTN are wordlength (W) of 11, M=5 and N=-4. One
example of a nucleic acid of the invention includes a 
polynucleotide sequence as set forth in SEQ ID N0:1, 
especially as obtained from an insect such as a bee. 
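
To make the roles of the three parameters concrete, the sketch below
shows what a word length and a match/mismatch score pair mean in
practice. It is not the BLASTN program itself: it only finds exact
word matches of length W and scores an ungapped pairing with a reward
M for each match and a penalty N for each mismatch. The function
names and example sequences are hypothetical.

```python
def ungapped_score(a: str, b: str, match: int = 5, mismatch: int = -4) -> int:
    """Score an ungapped pairing: +match per identical position, +mismatch otherwise."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def seed_positions(query: str, subject: str, w: int = 11) -> list:
    """Return (query_index, subject_index) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    query = "ACGTACGTACGT"
    subject = "TTACGTACGTACGTT"
    print(seed_positions(query, subject, w=11))
    print(ungapped_score("ACGTACGTACG", "ACGTACGTACC"))
```

With W=11, only stretches of at least 11 identical nucleotides seed a
comparison, so a larger W is faster but less sensitive.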

Alternatively, one can identify a nucleic acid of the 
invention by hybridizing, under stringent conditions, the 
nucleic acid of interest to a nucleic acid that includes a 
polynucleotide sequence of SEQ ID N0:1. The invention 
also includes nucleic acids which encode a polypeptide 
which is immunologically cross reactive with PX3.101 or 
subsequences thereof. 

Nucleic acid sequences of the present invention can be 
obtained by any suitable method known in the art, including, 
for example, 1) hybridization of genomic or cDNA libraries 
with probes to detect homologous nucleotide sequences; 2) 
antibody screening of expression libraries to detect cloned 
DNA fragments with shared structural features; 3) various 
amplification procedures such as polymerase chain reaction 
(PCR) using primers capable of annealing to the nucleic acid 
of interest; and 4) direct chemical synthesis.

In one embodiment, a nucleic acid of the invention is 
isolated by routine cloning methods. The nucleotide
sequence of a gene or cDNA encoding PX3.101 as provided
herein, is used to provide probes that specifically hybridize 
to a PX3.101 cDNA in a cDNA library, a PX3.101 gene in 
a genomic DNA sample, or to a PX3.101 mRNA in a total 
RNA sample (e.g., in a Southern or Northern blot). Once the 
target nucleic acid is identified, it can be isolated according 
to standard methods known to those of skill in the art. 

The desired nucleic acids can also be cloned using
well-known amplification techniques. Examples of protocols
sufficient to direct persons of skill through in vitro
amplification methods, including the polymerase chain reaction
(PCR), the ligase chain reaction (LCR), Qβ-replicase
amplification and other RNA polymerase mediated techniques, are
found in Berger, Sambrook, and Ausubel, as well as Mullis
et al. (1987) U.S. Pat. No. 4,683,202; PCR Protocols A
Guide to Methods and Applications (Innis et al. eds)
Academic Press Inc. San Diego, Calif. (1990) (Innis); Arnheim
& Levinson (Oct. 1, 1990) C&EN 36-47; The Journal Of
NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc.
Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc.
Natl. Acad. Sci. USA 87: 1874; Lomell et al. (1989) J. Clin.
Chem. 35: 1826; Landegren et al. (1988) Science 241:
1077-1080; Van Brunt (1990) Biotechnology 8: 291-294;
Wu and Wallace (1989) Gene 4: 560; and Barringer et al.
(1990) Gene 89: 117. Improved methods of cloning in vitro
amplified nucleic acids are described in Wallace et al., U.S.
Pat. No. 5,426,039. Suitable primers for use in the
amplification of the nucleic acids of the invention include, for
example, forward primer ASEQ10:
5' AAGGATCCACAGTGCAACGTAAGTTC 3' (SEQ ID NO:3), forward
primer ASEQ11:
5' AAGGATCCGGAGGATTTGGAGGATTTGGAGGATTTGGAGGACTTGGAGGACGTGG 3'
(SEQ ID NO:4), reverse primer ASEQ13:
5' ACTGATAAAATAATAAC 3' (SEQ ID NO:5), reverse primer
ASEQ14: 5' ATGAATGATAAAAEAC 3' (SEQ ID NO:6),
reverse primer ASEQ15: 5' TTATAAAAGTCATCCGC 3'
(SEQ ID NO:7).
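
When working with primer lists like the one above, two routine checks
are the primer length and the sequence of the strand to which a
reverse primer anneals. The following Python sketch is illustrative
only; the helper names are hypothetical, and the two primer strings
are transcribed from SEQ ID NO:3 and SEQ ID NO:7 as printed here, so
they should be checked against the sequence listing before any use.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    unknown = set(seq) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


if __name__ == "__main__":
    forward = "AAGGATCCACAGTGCAACGTAAGTTC"  # as transcribed from SEQ ID NO:3
    reverse = "TTATAAAAGTCATCCGC"           # as transcribed from SEQ ID NO:7
    print(len(forward), round(gc_fraction(forward), 2))
    print(reverse_complement(reverse))
```

The validation step rejects any character that is not A, C, G or T,
which is a simple way to catch transcription damage in a printed
primer sequence.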

As an alternative to cloning a nucleic acid, a suitable
nucleic acid can be chemically synthesized. Direct chemical
synthesis methods include, for example, the phosphotriester
method of Narang et al. (1979) Meth. Enzymol. 68: 90-99;
the phosphodiester method of Brown et al. (1979) Meth.
Enzymol. 68: 109-151; the diethylphosphoramidite method
of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; and
the solid support method of U.S. Pat. No. 4,458,066.
Chemical synthesis produces a single stranded oligonucleotide.
This can be converted into double stranded DNA by
hybridization with a complementary sequence, or by
polymerization with a DNA polymerase using the single strand as a
template. One of skill would recognize that while chemical
synthesis of DNA is often limited to sequences of about 100
bases, longer sequences may be obtained by the ligation of
shorter sequences. Alternatively, subsequences may be
cloned and the appropriate subsequences cleaved using
appropriate restriction enzymes. The fragments can then be
ligated to produce the desired DNA sequence.
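
The planning step implied by the roughly 100-base limit can be
sketched as follows. This Python example simply cuts a target
sequence into consecutive pieces of at most 100 bases that could be
synthesized separately and then joined; the function name and the
limit default are assumptions for illustration, and practical
designs typically also account for overlaps or sticky ends, which
are not modeled here.

```python
def split_for_synthesis(seq: str, max_len: int = 100) -> list:
    """Cut a sequence into consecutive fragments no longer than max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [seq[i:i + max_len] for i in range(0, len(seq), max_len)]


if __name__ == "__main__":
    # A placeholder 276-base target (the length of a region spanning
    # positions 74 to 349 inclusive); the bases themselves are dummy data.
    target = "ACGT" * 69
    fragments = split_for_synthesis(target)
    print([len(f) for f in fragments])
    assert "".join(fragments) == target  # joining in order restores the target
```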

In some embodiments, it may be desirable to modify the
nucleic acids of the invention. One of skill will recognize
many ways of generating alterations in a given nucleic acid
construct. Such well-known methods include site-directed
mutagenesis, PCR amplification using degenerate
oligonucleotides, exposure of cells containing the nucleic
acid to mutagenic agents or radiation, chemical synthesis of
a desired oligonucleotide (e.g., in conjunction with ligation
and/or cloning to generate large nucleic acids) and other
well-known techniques. See, e.g., Giliman and Smith (1979)
Gene 8:81-97, Roberts et al. (1987) Nature 328: 731-734.

III. Methods for Preparing or Isolating Protein

A. Recombinant Technologies

1. General

The nucleotide and amino acid sequences of PX3.101 as
shown in SEQ ID NO:1 and SEQ ID NO:2, respectively, and
corresponding sequences for other variants as described
above, allow for production of full-length PX3.101
polypeptide and fragments thereof. Such polypeptides can be
produced in prokaryotic or eukaryotic host cells by expression
of polynucleotides encoding PX3.101 or fragments thereof.
The cloned DNA sequences are expressed in hosts after the
sequences have been operably linked to an expression
control sequence in an expression vector. Expression vectors
are typically replicable in the host organisms either as
episomes or as an integral part of the host chromosomal
DNA. Commonly, expression vectors contain selection
markers, e.g., tetracycline resistance or hygromycin
resistance, to permit detection and/or selection of those cells
transformed with the desired DNA sequences (see, e.g., U.S.
Pat. No. 4,704,362).

2. Expression Cassettes and Host Cells for Expressing
Polypeptides

Typically, the polynucleotide that encodes a polypeptide
of the invention is placed under the control of a promoter
that is functional in the desired host cell to produce relatively
large quantities of a polypeptide of the invention. An
extremely wide variety of promoters are well-known, and
can be used in the expression vectors of the invention,
depending on the particular application. Ordinarily, the
promoter selected depends upon the cell in which the
promoter is to be active. Other expression control sequences
such as ribosome binding sites, transcription termination
sites and the like are also optionally included. Constructs
that include one or more of these control sequences are
termed "expression cassettes." Accordingly, the invention
provides expression cassettes into which the nucleic acids
that encode the polypeptides described herein are
incorporated for high level expression in a desired host cell.

In a preferred embodiment, the expression cassettes are
useful for expression of the polypeptides of the invention in
prokaryotic host cells. Commonly used prokaryotic control
sequences, which are defined herein to include promoters for
transcription initiation, optionally with an operator, along
with ribosome binding site sequences, include such
commonly used promoters as the beta-lactamase (penicillinase)
and lactose (lac) promoter systems (Change et al. (1977)
Nature 198: 1056), the tryptophan (trp) promoter system
(Goeddel et al. (1980) Nucleic Acids Res. 8: 4057), the tac
promoter (DeBoer et al. (1983) Proc. Natl. Acad. Sci. USA
80:21-25); and the lambda-derived promoter and N-gene
ribosome binding site (Shimatake et al. (1981) Nature 292:
128). The particular promoter system is not critical to the
invention; any available promoter that functions in
prokaryotes can be used.

For expression of polypeptides in prokaryotic cells other
than E. coli, a promoter that functions in the particular
prokaryotic species is required. Such promoters can be
obtained from genes that have been cloned from the species,
or heterologous promoters can be used. For example, the
hybrid trp-lac promoter functions in Bacillus in addition to
E. coli.

For expression of the polypeptides in yeast, convenient
promoters include GAL1-10 (Johnson and Davies (1984)
Mol. Cell. Biol. 4:1440-1448), ADH2 (Russell et al. (1983)
J. Biol. Chem. 258:2674-2682), PHO5 (EMBO J. (1982)
6:675-680), and MFα (Herskowitz and Oshima (1982) in
The Molecular Biology of the Yeast Saccharomyces (eds.
Strathern, Jones, and Broach) Cold Spring Harbor Lab.,
Cold Spring Harbor, N.Y., pp. 181-209). Another suitable
promoter for use in yeast is the ADH2/GAPDH hybrid
promoter as described in Cousens et al., Gene 61:265-275
(1987). Other promoters suitable for use in eukaryotic host
cells are well-known to those of skill in the art.

For expression of the polypeptides in mammalian cells,
convenient promoters include CMV promoter (Miller, et al.,
BioTechniques 7:980), SV40 promoter (de la Luma, et
al., (1998) Gene 62:121), RSV promoter (Yates, et al., (1985)
Nature 313:812), MMTV promoter (Lee, et al., (1981) Nature
294:228).

For expression of the polypeptides in insect cells, the
convenient promoter is from the baculovirus Autographa
californica nuclear polyhedrosis virus (AcMNPV) (Kilts, et
al., (1993) Nucleic Acids Research 18:5667).

Either constitutive or regulated promoters can be used in
the present invention. Regulated promoters can be
advantageous because the host cells can be grown to high densities
before expression of the polypeptides is induced. High level
expression of heterologous proteins slows cell growth in
some situations. An inducible promoter is a promoter that
directs expression of a gene where the level of expression is
alterable by environmental or developmental factors such as,
for example, temperature, pH, anaerobic or aerobic
conditions, light, transcription factors and chemicals. Such
promoters are referred to herein as "inducible" promoters,
and allow one to control the timing of expression of the
polypeptide. For E. coli and other bacterial host cells,
inducible promoters are known to those of skill in the art.
These include, for example, the lac promoter, the
bacteriophage lambda PL promoter, the hybrid trp-lac promoter
(Amann et al. (1983) Gene 25: 167; de Boer et al. (1983)
Proc. Nat'l. Acad. Sci. USA 80: 21), and the bacteriophage
T7 promoter (Studier et al. (1986) J. Mol. Biol.; Tabor et al.
(1985) Proc. Nat'l. Acad. Sci. USA 82: 1074-8). These
promoters and their use are discussed in Sambrook et al.,
supra. A particularly preferred inducible promoter for
expression in prokaryotes is a dual promoter that includes a
tac promoter component linked to a promoter component
obtained from a gene or genes that encode enzymes involved
in galactose metabolism (e.g., a promoter from a UDP
galactose 4-epimerase gene (galE)). The dual tac-gal
promoter, which is described in PCT Patent Application
Publ. No. WO98/20111, provides a level of expression that
is greater than that provided by either promoter alone.

Inducible promoters for other organisms are also well- 
known to those of skill in the art. These include, for 
example, the arabinose promoter, the lacZ promoter, the 
metallothionein promoter, and the heat shock promoter, as 
well as many others. 

A ribosome binding site (RBS) is conveniently included 
in the expression cassettes of the invention that are intended 
for use in prokaryotic host cells. An RBS in E. coli, for 
example, consists of a nucleotide sequence 3-9 nucleotides 
in length located 3-11 nucleotides upstream of the initiation 
codon (Shine and Dalgarno (1975) Nature 254: 34; Steitz, In
Biological regulation and development: Gene expression 
(ed. R. F. Goldberger), vol. 1, p. 349, 1979, Plenum 
Publishing, NY). 
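
The geometry described above (a 3-9 nucleotide element placed 3-11
nucleotides upstream of the initiation codon) can be checked with a
short script. The Python sketch below rests on assumptions that are
not in the specification: it uses TAAGGAGGT as a stand-in consensus
for the element, it measures the spacing as the number of nucleotides
between the end of the element and the first base of the start
codon, and the function name and test sequence are hypothetical.

```python
SD_CONSENSUS = "TAAGGAGGT"  # assumed 9-nt consensus, written as DNA


def find_rbs(seq: str, start: int, min_len: int = 3, max_len: int = 9,
             min_gap: int = 3, max_gap: int = 11):
    """Find the longest consensus fragment positioned upstream of a start codon.

    Returns (position, motif, gap) or None, where gap is the number of
    nucleotides between the end of the motif and the start codon at `start`.
    """
    seq = seq.upper()
    for length in range(max_len, min_len - 1, -1):      # prefer longer matches
        for k in range(len(SD_CONSENSUS) - length + 1):
            motif = SD_CONSENSUS[k:k + length]
            for gap in range(min_gap, max_gap + 1):
                pos = start - gap - length
                if pos >= 0 and seq[pos:pos + length] == motif:
                    return pos, motif, gap
    return None


if __name__ == "__main__":
    construct = "CCCAGGAGGAACAGCTATGAAA"   # toy sequence, not from the source
    start_codon = construct.index("ATG")
    print(find_rbs(construct, start_codon))
```

A result of None for a construct intended for a prokaryotic host
would suggest that the upstream region deserves a closer look, under
the assumptions stated above.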

Selectable markers are often incorporated into the expres- 
sion vectors used to express the polynucleotides of the 
invention. These genes can encode a gene product, such as 
a protein, necessary for the survival or growth of trans- 
formed host cells grown in a selective culture medium. Host 
cells not transformed with the vector containing the selec- 
tion gene will not survive in the culture medium. Typical 
selection genes encode proteins that confer resistance to 
antibiotics or other toxins, such as ampicillin, neomycin, 
kanamycin, chloramphenicol, or tetracycline. Alternatively, 
selectable markers may encode proteins that complement 
auxotrophic deficiencies or supply critical nutrients not 
available from complex media, e.g., the gene encoding 
D-alanine racemase for Bacilli. Often, the vector will have 
one selectable marker that is functional in, e.g., E. coli, or 
other cells in which the vector is replicated prior to being 
introduced into the host cell. A number of selectable markers 
are known to those of skill in the art and are described for 
instance in Sambrook et al., supra. A preferred selectable 
marker for use in bacterial cells is a kanamycin resistance 
marker (Vieira and Messing, Gene 19: 259 (1982)). Use of 
kanamycin selection is advantageous over, for example, 
ampicillin selection because ampicillin is quickly degraded 
by β-lactamase in culture medium, thus removing selective
pressure and allowing the culture to become overgrown with 
cells that do not contain the vector. 

Construction of suitable vectors containing one or more of
the above listed components employs standard ligation
techniques as described in the references cited above.
Isolated plasmids or DNA fragments are cleaved, tailored, and
re-ligated in the form desired to generate the plasmids
required. To confirm correct sequences in plasmids
constructed, the plasmids can be analyzed by standard
techniques such as by restriction endonuclease digestion,
and/or sequencing according to known methods. A wide
variety of cloning and in vitro amplification methods
suitable for the construction of recombinant nucleic acids are
well-known to persons of skill. Examples of these
techniques and instructions sufficient to direct persons of skill
through many cloning exercises are found in Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods
in Enzymology, Volume 152, Academic Press, Inc., San
Diego, Calif. (Berger); and Current Protocols in Molecular
Biology, F. M. Ausubel et al., eds., Current Protocols, a joint
venture between Greene Publishing Associates, Inc. and
John Wiley & Sons, Inc., (1998 Supplement) (Ausubel).

A variety of common vectors suitable for use as starting
materials for constructing the expression vectors of the
invention are well-known in the art. For cloning in bacteria,
common vectors include pBR322 derived vectors such as
pBLUESCRIPT™, pUC18/19, and λ-phage derived vectors.
In yeast, vectors which can be used include Yeast Integrating
plasmids (e.g., YIp5) and Yeast Replicating plasmids (the
YRp series plasmids), pYES series and pGPD-2, for example.
Expression in mammalian cells can be achieved, for
example, using a variety of commonly available plasmids,
including pSV2, pBC12BI, and p91023, pCDNA series,
pCMV1, pMAMneo, as well as lytic virus vectors (e.g.,
vaccinia virus, adenovirus), episomal virus vectors (e.g.,
bovine papillomavirus), and retroviral vectors (e.g., murine
retroviruses). Expression in insect cells can be achieved
using a variety of baculovirus vectors, including pFastBac1,
pFastBacHT series, pBlueBac4.5, pBlueBacHis series,
pMelBac series, and pVL1392/1393, for example.

Translational coupling can be used to enhance expression.
The strategy uses a short upstream open reading frame
derived from a highly expressed gene native to the
translational system, which is placed downstream of the promoter,
and a ribosome binding site followed after a few amino acid
codons by a termination codon. Just prior to the termination
codon is a second ribosome binding site, and following the
termination codon is a start codon for the initiation of
translation. The system dissolves secondary structure in the
RNA, allowing for the efficient initiation of translation. See,
Squires et al. (1988) J. Biol. Chem. 263: 16297-16302.

Polypeptides of the invention can be expressed in a
variety of host cells, including E. coli, other bacterial hosts,
yeast, and various higher eukaryotic cells such as the COS,
CHO and HeLa cell lines and myeloma cell lines. The host
cells can be mammalian cells, plant cells, insect cells or
microorganisms, such as, for example, yeast cells, bacterial
cells, or fungal cells. Examples of suitable host cells include
Azotobacter sp. (e.g., A. vinelandii), Pseudomonas sp.,
Rhizobium sp., Erwinia sp., Escherichia sp. (e.g., E. coli),
Bacillus, Pseudomonas, Proteus, Salmonella, Serratia,
Shigella, Rhizobia, Vitreoscilla, Paracoccus and Klebsiella
sp., among many others. The cells can be of any of several
genera, including Saccharomyces (e.g., S. cerevisiae),
Candida (e.g., C. utilis, C. parapsilosis, C. krusei, C. versatilis,
C. lipolytica, C. zeylanoides, C. guilliermondii, C. albicans,
and C. humicola), Pichia (e.g., P. farinosa and P. ohmeri),
Torulopsis (e.g., T. candida, T. sphaerica, T. xylinus, T.
famata, and T. versatilis), Debaryomyces (e.g., D.
subglobosus, D. cantarellii, D. globosus, D. hansenii, and D.
japonicus), Zygosaccharomyces (e.g., Z. rouxii and Z.
bailii), Kluyveromyces (e.g., K. marxianus), Hansenula
(e.g., H. anomala and H. jadinii), and Brettanomyces (e.g.,
B. lambicus and B. anomalus). Examples of useful bacteria
include, but are not limited to, Escherichia, Enterobacter,
Azotobacter, Erwinia, Klebsiella. The commonly used insect
cells to produce recombinant proteins are Sf9 cells (derived
from Spodoptera frugiperda ovarian cells) and High Five
cells (derived from Trichoplusia ni egg cell homogenates;
commercially available from Invitrogen).

The expression vectors of the invention can be transferred 
into the chosen host cell by well-known methods such as 
calcium chloride transformation for E. coli and calcium 
phosphate treatment or electroporation for mammalian cells. 
Cells transformed by the plasmids can be selected by 
resistance to antibiotics conferred by genes contained on the 
plasmids, such as the amp, gpt, neo and hyg genes. 

Once expressed, the recombinant polypeptides can be
purified according to standard procedures of the art,
including ammonium sulfate precipitation, affinity columns, ion
exchange and/or size exclusivity chromatography, gel
electrophoresis and the like (see, generally, R. Scopes, Protein
Purification, Springer-Verlag, N.Y. (1982), Deutscher,
Methods in Enzymology Vol. 182: Guide to Protein
Purification, Academic Press, Inc. N.Y. (1990)).
Substantially pure compositions of at least about 90 to 95%
homogeneity are preferred, and 98 to 99% or more homogeneity
are most preferred. Once purified, partially or to
homogeneity as desired, the polypeptides may then be used (e.g.,
treatment of inflammatory diseases in pre-clinical or clinical
studies).

To facilitate purification of the polypeptides of the
invention, the nucleic acids that encode the polypeptides can
also include a coding sequence for an epitope or "tag" for
which an affinity binding reagent is available. Examples of
suitable epitopes include the myc and V-5 reporter genes;
expression vectors useful for recombinant production of
polypeptides having these epitopes are commercially
available (e.g., Invitrogen (Carlsbad, Calif.) vectors
pcDNA3.1/Myc-His and pcDNA3.1/V5-His are suitable for expression
in mammalian cells; Invitrogen (Carlsbad, Calif.) vectors
pBlueBacHis and Gibco (Gaithersburg, Md.) vectors
pFastBacHT are suitable for expression in insect cells). Additional
expression vectors suitable for attaching a tag to the proteins
of the invention, and corresponding detection systems are
known to those of skill in the art, and several are
commercially available (e.g., FLAG™ (Kodak, Rochester N.Y.)).
Another example of a suitable tag is a polyhistidine
sequence, which is capable of binding to metal chelate
affinity ligands. Typically, six adjacent histidines are used,
although one can use more or less than six. Suitable metal
chelate affinity ligands that can serve as the binding moiety
for a polyhistidine tag include nitrilo-tri-acetic acid (NTA)
(Hochuli, E. (1990) "Purification of recombinant proteins
with metal chelating adsorbents" In Genetic Engineering:
Principles and Methods, J. K. Setlow, Ed., Plenum Press,
NY; commercially available from Qiagen (Santa Clarita,
Calif.)).
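
At the sequence level, adding a polyhistidine tag amounts to placing
a run of histidine codons in frame with the coding sequence. The
Python sketch below is only an illustration under stated
assumptions: it appends the tag at the 3' end (a C-terminal tag),
inserts it ahead of a terminal stop codon when one is present, and
uses the histidine codon CAC by default. The function name and the
toy coding sequence are hypothetical; commercial vectors supply the
tag themselves.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS_CODONS = {"CAC", "CAT"}  # the two codons that encode histidine


def add_his_tag(cds: str, n_his: int = 6, his_codon: str = "CAC") -> str:
    """Append n_his histidine codons in frame, ahead of a terminal stop codon if present."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if his_codon not in HIS_CODONS:
        raise ValueError("his_codon must be CAC or CAT")
    if n_his < 1:
        raise ValueError("n_his must be at least 1")
    tag = his_codon * n_his
    if cds[-3:] in STOP_CODONS:
        return cds[:-3] + tag + cds[-3:]
    return cds + tag


if __name__ == "__main__":
    toy_cds = "ATGGCTTAA"  # Met-Ala-stop, dummy data
    print(add_his_tag(toy_cds))
```

The default of six codons follows the statement above that six
adjacent histidines are typical, and the `n_his` argument covers the
case where more or fewer are used.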

Other haptens that are suitable for use as tags are known
to those of skill in the art and are described, for example, in
the Handbook of Fluorescent Probes and Research
Chemicals (6th Ed., Molecular Probes, Inc., Eugene Oreg.). For
example, dinitrophenol (DNP), digoxigenin, barbiturates
(see, e.g., U.S. Pat. No. 5,414,085), and several types of
fluorophores are useful as haptens, as are derivatives of these
compounds. Kits are commercially available for linking
haptens and other moieties to proteins and other molecules.
For example, where the hapten includes a thiol, a
heterobifunctional linker such as SMCC can be used to attach the tag
to lysine residues present on the capture reagent.

B. Naturally-Occurring Polypeptides

Naturally occurring polypeptides of the invention,
including full-length PX3.101 and fragments thereof, can be
isolated using conventional techniques such as affinity
chromatography. For example, polyclonal or monoclonal
antibodies are raised against previously-purified PX3.101 or
fragments thereof and attached to a suitable affinity column
by well-known techniques. See, e.g., Hudson & Hay,
Practical Immunology (Blackwell Scientific Publications,
Oxford, UK, 1980), Chapter 8 (incorporated herein by
reference in its entirety for all purposes). Peptide fragments
are generated from intact PX3.101 by chemical or enzymatic
cleavage methods which are known to those with skill in the
art. Example II also sets forth a method for purifying
PX3.101.

C. Other Methods

Alternatively, the polypeptides of the invention can be
synthesized by chemical methods or produced by in vitro
translation systems using a polynucleotide template to direct
translation. Methods for chemical synthesis of polypeptides
and in vitro translation are well-known in the art, and are
described further by Berger & Kimmel, Methods in
Enzymology, Volume 152, Guide to Molecular Cloning
Techniques, Academic Press, Inc., San Diego, Calif., 1987
(incorporated herein by reference in its entirety for all
purposes).

IV. Antibodies

In another embodiment of the invention, antibodies that
are immunoreactive with PX3.101 polypeptide or fragments
thereof are provided. The antibodies may be polyclonal
antibodies, distinct monoclonal antibodies or pooled
monoclonal antibodies with different epitopic specificities.
Monoclonal antibodies are made from antigen-containing
fragments of the protein by methods that are well-known in the
art (see, e.g., Kohler, et al., Nature, 256:495, (1975); and
Harlow & Lane, Antibodies, A Laboratory Manual
(C.S.H.P., NY, 1988), both of which are incorporated herein
by reference in their entirety for all purposes).

A. Production of Antibodies

Antibodies that bind to PX3.101 polypeptide or other
polypeptides of the invention can be prepared using intact
polypeptide or fragments containing small peptides of
interest as the immunizing antigen. For example, it may be
desirable to produce antibodies that specifically bind to the
N- or C-terminal domains of PX3.101. The polypeptide used
to immunize an animal can be from natural sources, derived
from translated cDNA, or prepared by chemical synthesis
and can be conjugated with a carrier protein, if desired.

(1989) and WO 90/07861 (incorporated by reference for all
purposes). Alternatively, one may isolate DNA sequences
which encode a human monoclonal antibody or a binding
fragment thereof by screening a DNA library from human B
cells according to the general protocol outlined by Huse et
al., Science 246:1275-1281 (1989) and then cloning and
amplifying the sequences which encode the antibody (or
binding fragment) of the desired specificity. The protocol
described by Huse is rendered more efficient in combination
with phage display technology. See, e.g., Dower et al., WO
91/17271 and McCafferty et al., WO 92/01047 (each of
which is incorporated by reference for all purposes). Phage
display technology can also be used to mutagenize CDR
regions of antibodies previously shown to have affinity for
the peptides of the present invention. Antibodies having
improved binding affinity are selected.

The antibodies can be further purified, for example, by
binding to and elution from a support to which the
polypeptide or a peptide to which the antibodies were raised is
bound. A variety of other techniques known in the art can
also be used to purify polyclonal or monoclonal antibodies
(see, e.g., Coligan, et al., Unit 9, Current Protocols in
Immunology, Wiley Interscience, (1994), incorporated
herein by reference in its entirety).

It is also possible to use the anti-idiotype technology to
produce monoclonal antibodies which mimic an epitope. For
example, an anti-idiotypic monoclonal antibody made to a
first monoclonal antibody will have a binding domain in the
hypervariable region which is the "image" of the epitope
bound by the first monoclonal antibody.

B. Antibodies

Antibodies of the invention are useful, for example, in
screening cDNA expression libraries and for identifying
clones containing cDNA inserts which encode
structurally-related, immunocrossreactive proteins. See, for example,
[authors, journal and year illegible in source] 84:8573-8577
(incorporated herein by reference in its entirety for
all purposes). Antibodies are also useful to identify and/or
purify immunocrossreactive proteins that are structurally
related to PX3.101 or to fragments thereof used to
generate the antibody.

V. Therapeutic Methods and Compositions

General

The present invention further provides pharmaceutical
compositions comprised of the polypeptides of the present
invention, including full-length PX3.101 and fragments
thereof. As explained more fully below in Example V, the
pharmaceutical compositions of the invention are useful in
treating a variety of diseases in both human and veterinary

Commonly used carriers include keyhole limpet hemocya- subjects. Diseases which can be treated with certain phar- 

nin (KLH), thyroglqbulin, bovine serum albumin (BSA), maceutical compositions of the inventions include a variety 

and tetanus toxoid. The coupled peptide is then used to of inflammatory diseases and autoimmune diseases, (e.g., 

immunize the animal (e.g., a mouse, a rat, or a rabbit). rheumatoid arthritis, multiple sclerosis, psoriasis, systemic 

Techniques for generation of human monoclonal antibod- 55 lupus erythematosus (SLE), Crohn's disease, scleroderma), 

ies have also been described but are generally more onerous metastatic cancers, and diseases associated with imbalances 

than murine techniques and not applicable to all antigens. in chemokine (e.g., 11,^, IL-10,IL-1, and TNF-a) production 

See, e.g., Larrick et al., U.S. Pat. No. 5,001,065, for review such as Alzheimer disease. Some compositions can also be 

(incorporated by reference for all purposes). An alternative used to treat pain, i.e., the composition can be used as an 

approach is the generation of humanized antibodies by 60 analgesic. 

linking the complementarity-determining regions or CDR Pharmaceutical compositioas of the invention are suitable 

regions (see, e.g., Kabat et al., "Sequences of Proteins of for use in a variety of drug delivery systems. Suitable 

Immunological Interest,'' U.S. Dept. of Health and Human formulations for use in the present invention are found in 

Services, (1987); and Chothia et al., J. Mol. Biol. Remington's Pharmaceutical Sciences, Mace Publishing 

196:901-917 (1987)) of non-human antibodies to human 65 Company, Philadelphia, Pa., 17th cd. (1985). For a brief 

constant regions by recombinant DNA techniques. See review of methods for drug delivery, see, Langer, Science 

Queen et al., Pwc. Natl Acad. Sci, USA 86:10029-10033 249:1527-1533 (1990). 
B. Composition and Delivery

The pharmaceutical compositions used for prophylactic
or therapeutic treatment comprise an active therapeutic
agent, for example, a PX3.101 protein or fragments thereof,
a PX3.101 receptor or fragments thereof, and antibodies and
idiotypic antibodies thereto, and a variety of other
components. Various subsequences of full-length PX3.101 can be
used. For example, the polypeptide composition used in the
animal studies described in Example V included peptides
having amino acid sequences consisting of residues 20-92,
22-92, 24-92 and 26-92 of SEQ ID NO:2.

In some instances, the efficacy of treatment may be
enhanced by using the pharmaceutical compositions of the
present invention with other complementary compounds
that are known to be effective in the treatment of various
diseases, especially inflammatory diseases. For example, the
pharmaceutical compositions of the invention may also
include a compound effective in treatment of inflammatory
diseases, such as rheumatoid arthritis for example. Such
compounds include ENBREL (manufactured by Immunex)
and INDOMETHACIN (manufactured by Merck),
METHOTREXATE (manufactured by Mylan and Roxane
Laboratories, Inc.), CELEBREX (manufactured by
Monsanto), VIOXX (manufactured by Merck), or
CYCLOSPORINE (manufactured by Novartis). Other
compounds that can be used to treat inflammatory diseases and
which can be combined with certain compositions of the
invention can be found in the Physician's Desk Reference
(1998), which is incorporated herein by reference in its
entirety. The complementary compounds used in combination
with PX3.101, fragments thereof and/or antibodies
thereto, typically have a different mode of action than
PX3.101 or fragments thereof and/or differ with respect to
the time period during which they are therapeutically
effective. Thus, for example, a pharmaceutical composition of the
invention includes a therapeutically effective amount of
PX3.101 (or an active fragment thereof) in combination with
ENBREL since early studies indicate that the two
compounds appear to have different mechanisms of action and
different time periods during which the therapeutic effect is
maintained once treatment is stopped.

The compositions may also include, depending on the
formulation desired, pharmaceutically-acceptable, non-toxic
carriers or diluents, which are defined as vehicles commonly
used to formulate pharmaceutical compositions for animal
or human administration. The diluent is selected so as not to
affect the biological activity of the combination. Examples
of such diluents are distilled water, buffered water,
physiological saline, PBS, Ringer's solution, dextrose solution,
and Hank's solution. In addition, the pharmaceutical
composition or formulation may also include other carriers,
adjuvants, or non-toxic, nontherapeutic, nonimmunogenic
stabilizers, excipients and the like. The compositions may
also include additional substances to approximate
physiological conditions, such as pH adjusting and buffering
agents, toxicity adjusting agents, wetting agents, detergents
and the like.

The composition may also include any of a variety of
stabilizing agents, such as an antioxidant for example.
Moreover, the polypeptides may be complexed with various
well-known compounds that enhance the in vivo stability of
the polypeptide, or otherwise enhance its pharmacological
properties (e.g., increase the half-life of the polypeptide,
reduce its toxicity, enhance solubility or uptake). Examples
of such modifications or complexing agents include the
production of sulfate, gluconate, citrate, phosphate and the
like. The polypeptides of the composition may also be
complexed with molecules that enhance their in vivo
attributes. Such molecules include, for example,
carbohydrates, polyamines, amino acids, other peptides,
ions (e.g., sodium, potassium, calcium, magnesium,
manganese), and lipids.

Further guidance regarding formulations that are suitable
for various types of administration can be found in
Remington's Pharmaceutical Sciences, Mack Publishing
Company, Philadelphia, Pa., 17th ed. (1985). For a brief
review of methods for drug delivery, see Langer, Science
249:1527-1533 (1990).

The compositions containing the polypeptides can be
administered for prophylactic and/or therapeutic treatments.
The polypeptide in the pharmaceutical composition
typically is present in a therapeutic amount, which is an amount
sufficient to remedy a disease state or symptoms, particularly
symptoms associated with inflammation, or otherwise
prevent, hinder, retard, or reverse the progression of disease
or any other undesirable symptoms in any way whatsoever.
The concentration of the polypeptide in the pharmaceutical
composition can vary widely, i.e., from less than about 0.1%
by weight, usually being at least about 1% by weight, to as
much as 20% by weight or more.

In therapeutic applications, compositions are administered
to a patient already suffering from a disease, as just
described, in an amount sufficient to cure or at least partially
arrest the symptoms of the disease and its complications. An
appropriate dosage of the pharmaceutical composition or
polypeptide of the invention is readily determined according
to any one of several well-established protocols. For
example, animal studies (e.g., mice, rats) are commonly
used to determine the maximal tolerable dose of the
bioactive agent per kilogram of weight. In general, at least one of
the animal species tested is mammalian. The results from the
animal studies can be extrapolated to determine doses for
use in other species, such as humans for example.

What constitutes an effective dose also depends on the
nature and severity of the disease or condition, and on the
general state of the patient's health, but will generally range
from about 1 to 500 mg of purified protein per kilogram of
body weight, with dosages of from about 5 to 100 mg per
kilogram being more commonly employed.

In prophylactic applications, compositions containing the
compounds of the invention are administered to a patient
susceptible to or otherwise at risk of a particular disease or
infection. Such an amount is defined to be a
"prophylactically effective" amount or dose. In this use, the precise
amount again depends on the patient's state of health and
weight. Typically, the dose ranges from about 1 to 500 mg
of purified protein per kilogram of body weight, with
dosages of from about 5 to 100 mg per kilogram being more
commonly utilized.

The pharmaceutical compositions described herein can be
administered in a variety of different ways. Examples
include administering a composition containing a
pharmaceutically acceptable carrier via oral, intranasal, rectal,
topical, intraperitoneal, intravenous, intramuscular,
subcutaneous, subdermal, transdermal, intrathecal, and
intracranial methods.

For oral administration, the active ingredient can be
administered in solid dosage forms, such as capsules,
tablets, and powders, or in liquid dosage forms, such as
elixirs, syrups, and suspensions. The active component(s)
can be encapsulated in gelatin capsules together with
inactive ingredients and powdered carriers, such as glucose,
lactose, sucrose, mannitol, starch, cellulose or cellulose
derivatives, magnesium stearate, stearic acid, sodium
saccharin, talcum, magnesium carbonate. Examples of
additional inactive ingredients that may be added to provide
desirable color, taste, stability, buffering capacity, dispersion
or other known desirable features are red iron oxide, silica
gel, sodium lauryl sulfate, titanium dioxide, and edible white
ink. Similar diluents can be used to make compressed
tablets. Both tablets and capsules can be manufactured as
sustained release products to provide for continuous release
of medication over a period of hours. Compressed tablets
can be sugar coated or film coated to mask any unpleasant
taste and protect the tablet from the atmosphere, or
enteric-coated for selective disintegration in the gastrointestinal
tract. Liquid dosage forms for oral administration can
contain coloring and flavoring to increase patient acceptance.

If desired, it is possible to formulate solid or liquid
formulations in an enteric-coated or otherwise protected
form. In the case of liquid formulations, the formulation can
be mixed or simply coadministered with a protectant, such
as a liquid mixture of medium chain triglycerides, or the
formulation can be filled into enteric capsules (e.g., of soft
or hard gelatin, which are themselves optionally additionally
enteric coated). Alternatively, solid formulations comprising
the polypeptide can be coated with enteric materials to form
tablets. The thickness of enteric coating on tablets or
capsules can vary. Typical thicknesses range from 0.5 to 4
microns. The enteric coating may comprise any of the
enteric materials conventionally utilized in orally
administrable pharmaceutical formulations. Suitable enteric coating
materials are known, for example, from Remington's
Pharmaceutical Sciences, Mack Publishing Company,
Philadelphia, 17th ed. (1985); and Hagers Handbuch der
Pharmazeutischen Praxis, Springer Verlag, 4th ed., Vol. 7a
(1971).

Another delivery option involves loading the composition
into lipid-associated structures (e.g., liposomes, or other
lipidic complexes) which may enhance the pharmaceutical
characteristics of the polypeptide component of the
composition. The complex containing the composition may
subsequently be targeted to specific target cells by the
incorporation of appropriate targeting molecules (e.g., specific
antibodies or receptors). It is also possible to directly
complex the polypeptide with a targeting agent.

Compositions prepared for intravenous administration
typically contain 100 to 500 ml of sterile 0.9% NaCl or 5%
glucose optionally supplemented with a 20% albumin
solution and 100 to 500 mg of a polypeptide of the invention. A
typical pharmaceutical composition for intramuscular
injection would be made up to contain, for example, 1 ml of
sterile buffered water and 1 to 10 mg of the purified
polypeptide of the invention. Methods for preparing
parenterally administrable compositions are well-known in
the art and described in more detail in various sources,
including, for example, Remington's Pharmaceutical
Science, Mack Publishing, Philadelphia, Pa., 17th ed.,
(1985).

Particularly when the compositions are to be used in vivo,
the components used to formulate the pharmaceutical
compositions of the present invention are preferably of high
purity and are substantially free of potentially harmful
contaminants (e.g., at least National Food (NF) grade,
generally at least analytical grade, and more typically at least
pharmaceutical grade). Moreover, compositions intended
for in vivo use are usually sterile. To the extent that a given
compound must be synthesized prior to use, the resulting
product is typically substantially free of any potentially toxic
agents, particularly any endotoxins, which may be present
during the synthesis or purification process. Compositions
for parenteral administration are also sterile, substantially
isotonic and made under GMP conditions.

V. Uses

The pharmaceutical compositions of the present invention
can be used to treat a variety of diseases. For example, the
pharmaceutical compositions can be used in treating various
inflammatory diseases. As described in more detail in
Example V, certain compositions of the invention have been
shown to be effective in treating rheumatoid arthritis in
animal model studies. In particular, PX3.101 inhibits
enzymes that are involved in the pathogenesis of rheumatoid
arthritis, such as cyclooxygenases, phospholipases,
lipoxygenases, and various proteases. PX3.101 polypeptide
has also been shown to inhibit interaction between cytokines
and their receptors (see Example VI), such as IL-8/CXCR2
interaction for example. IL-8 is a major chemokine that
regulates the inflammatory process (see, e.g., Harada, et al.,
(1994) J. Leukoc. Biol. 56:559). There is also evidence that
links IL-8 to tumor angiogenesis and tumor metastasis
(Koch, et al., Science 258:1798). Thus some polypeptides of
the invention can be used in treating cancer, autoimmune
diseases, and/or other inflammatory diseases associated with
chemokine imbalances, especially diseases correlated with
IL-8 such as Alzheimer disease.

PX3.101 and other polypeptides of the invention also find
use in inhibition and kinetic investigations. For example,
PX3.101 can be used in studies into the interaction between
chemokines and receptors therefor, for example the
interaction between IL-8 and CXCR2 and CXCR1. Methods
involving inhibition of chemokines generally involve
allowing a quantity of the chemokine and receptor to admix with
a polypeptide of the invention. More specifically, the method
involves adding a polypeptide of the invention to a sample
containing the chemokine and receptor and preferably
mixing the resulting mixture. Additions may be made to an in
vitro solution or directly into a patient.

PX3.101 is also useful in studies or treatments involving
inhibition of various enzymes such as cyclooxygenases (for
example, COX1 and COX2), phospholipases (for example,
phospholipase A2 and phospholipase C), lipoxygenase, and
various proteases (for example, trypsin and cathepsin G).
Such methods generally involve allowing an enzyme,
particularly those involved in inflammatory processes, to admix
with a polypeptide of the invention. Typically, a quantity of
polypeptide is added to a sample containing the enzyme of
interest. Here, too, additions may be made into an in vitro
solution or directly into a patient.

Based upon the foregoing activities, it is also expected
that certain polypeptides of the invention can be useful in
treating blood coagulation diseases, accelerating wound
healing, UV-light protection, reducing various aging
phenomena and as a pain analgesic.

By using PX3.101 to study the interaction between
chemokines and their receptors, or the direct interaction
between PX3.101 and a cognate receptor (e.g., the amino
acid sequence which binds to a receptor), small molecules
which mimic this interaction can be developed, thus
enabling the small molecule to be used to obtain a
therapeutic effect similar to that obtained using PX3.101.

The nucleotide and peptide sequences of PX3.101 are also
useful to generate primers and/or probes to screen for
PX3.101 homologues in different species, particularly in
human. Human homologues of PX3.101 can be directly used
as a therapeutic material or as a target to screen drug
candidates for various human diseases.

The following examples are offered to further illustrate
specific aspects of the present invention and are not to be
interpreted so as to limit the scope of the present invention.
EXAMPLE I

General Methods and Materials 

A. Electrophoresis and Western Blotting 

SDS polyacrylamide gel electrophoresis (SDS-PAGE)
was carried out according to standard methods (Sambrook,
et al., Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory Press, 1989). Purified
PX3.101 protein was electrophoresed in 10-20% SDS-
polyacrylamide gradient gels (BioRad, Richmond, Calif.)
and then transferred onto nitrocellulose membrane
(Schleicher & Schuell, Keene, N.H.). The sera collected from
animals treated with PBS or PX3.101 (200 μg/kg) were
diluted 1:30 in blocking solution (PBS, 0.2% Tween-20, 5%
dry milk). The blot was divided using a Mini-PROTEAN II
multi-screening apparatus and probed with diluted sera.
Horseradish peroxidase (HRP)-conjugated goat anti-mouse
IgG at 1:10,000 dilution was used as the secondary
antibody. Signals were visualized using an ECL (enhanced
chemiluminescence) system.

B. Mass Spectral Analysis 

Mass spectral analysis was carried out by the Keck Facility at
Yale University according to the protocols outlined on their
website (http://www.info.med.yale.edu/wmkeck). Matrix
assisted laser desorption ionization mass spectrometry
(MALDI-MS) was used to determine the molecular weights
of the peptide fragments from the trypsin digest of PX3.101,
and both MALDI-MS and electrospray ionization mass
spectrometry (ESMS) were used to determine the molecular
weights of the purified PX3.101 mutual derivatives.
MALDI-MS was carried out on a research grade VG
Tofspec SE instrument equipped with delayed extraction and
a reflectron. ESMS was carried out on a Micromass Q-Tof
mass spectrometer.

C. Reverse Phase (RP) HPLC 

Reverse phase HPLC (RP-HPLC) was performed using a
Varian Dynamax Model SD-200 solvent delivery module
and a UV detector (Dynamax Absorbance Detector Model
UV-C). Data were acquired using Varian
Dynamax Method (Version 1.4.6) software. A Varian
Microsorb-MV C-18 reverse phase column (0.46 cm x 25
cm) was typically used for analytical runs, while a
semi-prep C-18 reverse phase column (1.0 cm x 25 cm),
Varian Dynamax 300 Å, was used mainly for purification.
Buffers used for elution were Buffer A (water with
0.1% trifluoroacetic acid) and Buffer B (acetonitrile with
0.1% trifluoroacetic acid). Elution was achieved using a
discontinuous linear gradient formed from buffers A and B.
Columns were run at room temperature (flow rate: 1
ml/min for analytical HPLC; 4 ml/min for semi-prep). Peak
fractions were collected by hand.

D. HPLC-Ion Exchange Chromatography

HPLC-ion exchange chromatography was also used for
the purification of PX3.101. The same solvent delivery
system described above was used. A Varian Hydropore
strong cation exchange column (0.46 cm x 10 cm) was used.
Buffers used for elution were Buffer A (0.1 M ammonium
formate in water, pH 5.8) and Buffer B (1 M ammonium
formate, pH 6.7). Elution was achieved using a discontinuous
linear gradient formed from buffers A and B. Columns
were run at room temperature at a flow rate of 1 ml/min.

E. Amino Acid Analysis 

Amino acid analysis was carried out on a Beckman Model
6300 ion-exchange instrument following a 16 hr hydrolysis
at 115° C. in 100 μl of 6 N HCl, 0.2% phenol that also
contained 2 nM norleucine. Norleucine served as an internal
standard to correct for losses that might occur during sample
transfers and drying, etc. After hydrolysis, the HCl was dried
in a Speedvac and the resulting amino acids dissolved in 100
μl Beckman sample buffer that contained 2 nM homoserine
that acted as a second internal standard to independently
monitor transfer of the sample onto the analyzer. The
instrument was calibrated with a 2 nM mixture of amino
acids and was operated according to the manufacturer's
programs and using the manufacturer's buffers. Data analysis
was carried out on an external computer using Perkin
Elmer/Nelson data acquisition software. Improved quantitation
of cysteine was obtained by prior oxidation with
performic acid in a second sample. This procedure converted
both cysteine and cystine to cysteic acid. Performic acid
oxidation may destroy tyrosine.
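
The internal-standard correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes that handling losses affect every amino acid in proportion to the norleucine recovered, the function name and the example readings are hypothetical, and the spiked quantity is passed in whatever unit the analyzer reports.

```python
def correct_for_recovery(measured, standard_measured, standard_added=2.0):
    """Scale measured amino acid amounts by the recovery of an internal standard.

    measured          -- dict of amino acid name -> measured amount
    standard_measured -- amount of internal standard (e.g. norleucine) detected
    standard_added    -- amount of internal standard spiked in (same unit)

    Assumption (not stated in the text): losses are proportional for all
    amino acids, so a single recovery factor applies to every value.
    """
    if standard_measured <= 0 or standard_added <= 0:
        raise ValueError("internal standard amounts must be positive")
    recovery = standard_measured / standard_added
    return {name: amount / recovery for name, amount in measured.items()}


if __name__ == "__main__":
    # Hypothetical readings: 75% of the spiked norleucine was recovered.
    readings = {"Gly": 1.5, "Arg": 0.3}
    print(correct_for_recovery(readings, standard_measured=1.5))
```

The same function could be applied with the second standard (homoserine) to check sample transfer onto the analyzer separately from hydrolysis losses.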

F. N-terminal Sequencing 

Direct N-terminal amino acid sequencing of PX3.101 was
carried out as previously described (Stone, et al. In
Techniques in Protein Chemistry, Academic Press, 1992, New
York, 23-34). PX3.101 purified by SDS-PAGE was
electroblotted onto a Mini-Problott pure PVDF membrane
(Applied Biosystems, Foster City, Calif.). The band was
visualized by Ponceau S and cut for N-terminal protein
sequencing. Alternatively, PX3.101 from Sephadex G-50
chromatography was further purified and mutual derivatives
were resolved by RP HPLC and HPLC-ion exchange
chromatography as described above. N-terminal protein
sequencing of either form was carried out by automated
Edman degradation with an Applied Biosystems sequencer
(Model Procise 494 cLC, Foster City, Calif.). An on-line
HPLC-analyzer was used for the identification of
phenylthiohydantoin (PTH) amino acids.

G. Internal Sequencing 

Internal protein sequencing was performed according to
standard methods (Stone, et al. A Practical Guide to Protein
and Peptide Purification for Microsequencing, 2nd ed.
Academic Press, 1993, New York, 43-69). A Coomassie Blue
stained SDS-PAGE gel band was cut and subjected to in-gel
trypsin digestion. Peptide fragments were subsequently
extracted and analyzed by LC/MS/MS analysis. The resulting
MS/MS spectra were compared with spectra in the National
Center for Biotechnology Information (NCBI) non-redundant
database to determine whether the protein was
known. When no protein was identified using this method,
the digest was purified by reverse phase HPLC. Detected
peak fractions were collected and selected peptides were
subjected to mass spectral analysis. N-terminal amino acid
sequencing of these peptides was later carried out as
described in Example I.

EXAMPLE II 

Protein Purification and Characterization

A. Fractionation of Bee Venom 

Lyophilized honeybee venom (approximately 0.5 g)
(Apitronic Services, Richmond, British Columbia, Canada)
or honeybee venom in suspension (approximately 0.5 g solid
material per ml, Sigma, St. Louis, Mo.) was diluted in 10 ml
of 0.1 M ammonium formate buffer (pH 4.6) to give a
solution having a concentration of approximately 0.05 g
venom/ml. This solution was centrifuged and filtered
through a 0.45 μm filter, and then loaded onto a Sephadex
G-50 column (two columns, each 1.5 x 170 cm (diameter x
length), connected in series) pre-equilibrated with
0.1 M ammonium formate buffer (pH 4.6). The column was
eluted at about 0.6 ml/min, and fractions of 100 drops
(approximately 4.0 ml) were collected.

Fractions containing PX3.101 (fractions 65 to 72)
appeared between the peaks of phospholipase A2 and melittin
tetramer (FIG. 1). Fractions in the shoulder peak
(fractions 65 to 72) were analyzed by SDS gel electrophoresis.
Only one major band was found to have a molecular
weight lower than phospholipase A2. Protein in this band
had an apparent molecular weight of approximately 7700
daltons and was called PX3.101. Melittin monomer has a
molecular weight of 3000 daltons.
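
The arithmetic behind the fractionation figures above (0.5 g in 10 ml, 4.0 ml fractions, 0.6 ml/min) can be checked with a short Python sketch. It is an illustration only and rests on assumptions the text does not state: fractions are numbered from 1, collection starts when elution starts, and the flow rate is constant. All function names are hypothetical.

```python
VENOM_G = 0.5          # grams of venom loaded (from the text)
BUFFER_ML = 10.0       # ml of ammonium formate buffer (from the text)
FRACTION_ML = 4.0      # approximate volume of one 100-drop fraction
FLOW_ML_PER_MIN = 0.6  # approximate elution flow rate


def venom_concentration():
    """Return the loading concentration in g venom per ml."""
    return VENOM_G / BUFFER_ML


def fraction_window(number):
    """Return (start_ml, end_ml) of elution volume covered by a fraction.

    Assumes fractions are numbered from 1 and collected back to back.
    """
    end = number * FRACTION_ML
    return end - FRACTION_ML, end


def elution_time_min(volume_ml):
    """Return minutes needed to elute a given volume at constant flow."""
    return volume_ml / FLOW_ML_PER_MIN


if __name__ == "__main__":
    start, _ = fraction_window(65)
    _, end = fraction_window(72)
    print(f"loading concentration: {venom_concentration()} g/ml")
    print(f"fractions 65-72 span {start}-{end} ml of eluate")
    print(f"about {elution_time_min(end) / 60:.1f} h to reach the end of fraction 72")
```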

B. Purification of PX3.101 

Sephadex G-50 column fractions containing PX3.101 (as
identified by SDS-PAGE) were pooled and PX3.101 was
enriched by RP HPLC as described in Example I. A typical
elution profile obtained during this RP HPLC step is shown
in FIG. 2A. Fractions containing PX3.101, as confirmed by
SDS-PAGE and trypsin inhibition assay (data not shown),
were freeze-dried and further purified by ion-exchange
HPLC as described in Example I. An elution profile is shown
in FIG. 2B. A second RP HPLC served as the final step of
the purification and was eluted according to the conditions
described in Example I. As shown in the elution profile
depicted in FIG. 2C, three major peaks were obtained, all of
them with shoulders, in this final step. The main peaks were
named Puri-#1, Puri-#2 and Puri-#3. These fractions had
similar molecular weights based on SDS-PAGE, and all of
them showed similar trypsin inhibition activity (data not
shown). N-terminal sequencing of the main peaks confirmed
that all of them were PX3.101 but with one to several
N-terminal amino acids deleted (Table II). These molecules
were considered mutual derivatives of PX3.101. These
results are consistent with the direct N-terminal sequencing
results in which the major PX3.101 band from the
SDS-PAGE yielded multiple sequences (see below). The Puri-#1,
Puri-#2 and Puri-#3 fractions were combined and used in the
animal studies as well as mechanism of action studies of
PX3.101 (see Examples V and VI).

C. Characterization 

The major gel band from the SDS gel run with G-50
fractions was cut and subjected to in-gel trypsin digestion.
The resulting digest was analyzed by LC/MS/MS. Ten to
twelve major peptide fragments were formed as a result of
digestion and their molecular weights determined (data not
shown). No peptides in the NCBI non-redundant database
were found to match the predicted molecular weights of
peptides from trypsin digestion.
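
Matching measured fragment masses against a database relies on predicting which peptides trypsin would release from a candidate sequence and what they would weigh. The Python sketch below illustrates that idea under stated assumptions: it uses the common rule that trypsin cleaves after K or R unless the next residue is P, standard monoisotopic residue masses, and a made-up input sequence formed by joining two of the peptides reported in this document. It is not the search software used by the sequencing facility, which the text does not name.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def tryptic_peptides(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def monoisotopic_mass(peptide):
    """Return the neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


if __name__ == "__main__":
    # Hypothetical input: two reported peptides joined end to end,
    # NOT the actual PX3.101 sequence.
    demo = "PSNEIFSR" + "GFGGFGGLGGR"
    for peptide in tryptic_peptides(demo):
        print(peptide, round(monoisotopic_mass(peptide), 3))
```

A real search would also allow for missed cleavages and modifications such as cysteine alkylation, which this sketch leaves out.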

The peptide fragments from the trypsin digestion were
then purified by RP-HPLC. Detailed protocols for the
purification are described on the website
(http://info.med.yale.edu/wmkeck/prochem.htm). The peak
fractions (#47, 61, 62, 65, 75, 88 and 106) were collected and
further analyzed by mass spectroscopy. Four peptide
fragments (fractions #47, 62, 75 and 88) were selected for amino
acid sequencing. The amino acid sequences of the peptides
are shown in Table I below.

Direct N-terminal sequencing of the major SDS gel band
blotted to PVDF membrane was also tried, but yielded
multiple amino acids in each cycle of the sequencing (data
not shown). This result suggested either major contaminants
or the presence of multiple forms of PX3.101 that were
mutual derivatives. Further study indicated that the presence of
multiple forms of this molecule resulted from deletions of
different numbers of amino acids from the N-terminus (see
below).

Polypeptides contained in the fractions collected from the
final RP-HPLC column (i.e., Puri-#1, Puri-#2, and Puri-#3;
see FIG. 2C) had similar molecular weights based on
SDS-PAGE, and all of them showed similar trypsin inhibition
activity (data not shown). N-terminal sequencing of the
main peaks confirmed that all of them were PX3.101 but
with one to several N-terminal amino acids deleted (Table
II). These molecules were considered mutual derivatives of
PX3.101. These results are consistent with the direct
N-terminal sequencing results in which the major PX3.101
band from the SDS-PAGE yielded multiple sequences (see
above).

A predicted molecular weight of about 7,700 daltons
correlated well with the migration of PX3.101 on SDS gel
but did not correspond well with where the protein eluted in
the elution profile for the Sephadex G-50 sizing column. The
fact that the fraction containing PX3.101 eluted between
phospholipase A2 (MW 19,000) and melittin tetramer (MW
11,400) indicated an apparent molecular weight between
11,400 and 19,000. This discrepancy suggested that either
the PX3.101 was present as a dimer, or that there was
significant post-translational modification of the protein. Gel
electrophoresis results obtained under non-reducing conditions
indicated the presence of dimers of PX3.101 molecules
(data not shown).
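
The dimer hypothesis can be checked against the numbers given above with simple arithmetic. The Python sketch below, offered only as an illustration, asks which whole-number multiples of the 7,700 dalton monomer fall inside the 11,400 to 19,000 dalton window bracketed by the two marker proteins. The function name and the upper limit on oligomer size are arbitrary choices made here.

```python
MONOMER_DA = 7700        # apparent monomer weight on SDS gel (from the text)
LOWER_MARKER_DA = 11400  # melittin tetramer (from the text)
UPPER_MARKER_DA = 19000  # phospholipase A2 (from the text)


def oligomers_in_window(monomer, lower, upper, max_subunits=6):
    """Return subunit counts whose total mass lies strictly between the markers."""
    return [n for n in range(1, max_subunits + 1) if lower < n * monomer < upper]


if __name__ == "__main__":
    for n in oligomers_in_window(MONOMER_DA, LOWER_MARKER_DA, UPPER_MARKER_DA):
        print(f"{n} subunits -> {n * MONOMER_DA} Da")
```

Only the two-subunit form (15,400 daltons) fits the window, which agrees with the dimer seen under non-reducing conditions. Gel filtration reports hydrodynamic size rather than mass, so this is a consistency check and not proof.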

A glucose assay did not show any glycosylation of
PX3.101 (data not shown). The molecular weights of
Puri-#1, Puri-#2 and Puri-#3, as determined by mass spectral
analysis, matched well with the molecular weights predicted
from their amino acid sequences (Table II). These results
suggested that there are no post-translational modifications
of these derivatives. The C-termini of these mutual
derivatives are likely to be intact and to be the same.

The amino acid analysis of the purified PX3.101 used for
animal studies showed that it had an extinction coefficient at
280 nm of 0.286 ml/mg/cm. The protein amount in any given
preparation was determined from its absorbance at 280 nm and
the extinction coefficient.
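
The conversion from absorbance to protein concentration follows the Beer-Lambert relation, concentration = A280 / (extinction coefficient x path length). A minimal Python sketch is given below. The 0.286 ml/mg/cm value comes from the text, while the 1 cm default path length, the dilution factor parameter, and the function name are assumptions added for illustration.

```python
EXTINCTION_ML_PER_MG_CM = 0.286  # PX3.101 at 280 nm (from the text)


def protein_mg_per_ml(a280, path_cm=1.0, dilution_factor=1.0):
    """Return protein concentration in mg/ml from absorbance at 280 nm.

    a280            -- blank-corrected absorbance of the measured sample
    path_cm         -- cuvette path length (assumed 1 cm if not given)
    dilution_factor -- fold dilution applied before measuring (1 = none)
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a280 / (EXTINCTION_ML_PER_MG_CM * path_cm) * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading of 0.143 on a sample diluted 10-fold.
    print(round(protein_mg_per_ml(0.143, dilution_factor=10.0), 2), "mg/ml")
```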

EXAMPLE III 

Cloning PX3.101 cDNA and Predicted Protein Sequence

A. Method 

The degenerate oligonucleotide primer
(5' ATGGATCCAAYGARATHTTYWSNAG 3', SEQ ID NO:8; Y=C or
T; R=A or G; H=A or C or T; W=A or T; N=A or C or G or
T) was designed based on the amino acid sequence
(NEIFSR, SEQ ID NO:9) obtained from protein sequencing.
All the PX3.101 primers and Oligo(dT)

(5'TrGCGGCCGcrnTnTnTrnTmT3'— SEQ id 

NO: 10) were synthesized and purified by Genemed 
Synthesis, Inc. 
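
As an illustration of how a degenerate primer of this kind relates
to the peptide it was designed from, the sketch below counts the
number of distinct oligonucleotides in the SEQ ID NO:8 pool and
checks that the degenerate portion matches the corresponding stretch
of the PX3.101 coding sequence. The IUPAC code S (C or G) is the
standard meaning and is not spelled out in the legend above. The
cDNA fragment is transcribed from SEQ ID NO:1 around the NEIFSR
codons and should be treated as an assumption; the helper names are
hypothetical.

```python
import re
from math import prod

# Standard IUPAC nucleotide codes used in the primer
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "H": "ACT", "N": "ACGT",
}

PRIMER = "ATGGATCCAAYGARATHTTYWSNAG"   # SEQ ID NO:8
CORE = PRIMER[8:]                      # degenerate part after the 5' ATGGATCC tail

# Assumed transcription of SEQ ID NO:1 around the codons for P-S-N-E-I-F-S-R-C
CDNA_FRAGMENT = "CCAAGCAATGAGATCTTCAGTAGATGC"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def to_regex(primer):
    """Compile a regular expression that matches any member of the pool."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer))


if __name__ == "__main__":
    print("pool size:", degeneracy(PRIMER))
    match = to_regex(CORE).search(CDNA_FRAGMENT)
    print("core matches cDNA at offset:", match.start() if match else None)
```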

The total RNA of the honeybee venom gland was prepared
as previously described (Chomczynski, et al., (1995)
Anal. Biochem. 225:163). Venom glands were collected from
honeybee Apis mellifera by Apitronic Services (Richmond,
British Columbia, Canada) and stored at -80° C. 100 venom
glands in STAT-60 solution (TEL-TEST "B" Inc. TX) were
homogenized using a glass-Teflon homogenizer. Total RNA
was extracted and isolated using the RNA isolation kit from
TEL-TEST "B" Inc.

The first strand cDNA was synthesized by reverse transcription
using the total RNA from honey bee gland as the
template and Oligo(dT) as the primer. The PX3.101 gene
fragment was amplified by PCR using the degenerate
PX3.101 primer and Oligo(dT). The amplified DNA fragments
were cloned into the NotI and BamHI sites of pBluescript
sk(+) vector. The sequences of the DNA fragments
were obtained through contracted service from Genemed
Synthesis, Inc. The predicted protein sequences were analyzed
to see if they matched peptide sequences obtained
through protein sequencing: i.e., sequences: a) PSNEIFSR
(SEQ ID NO:11) (residues 38 to 45 of SEQ ID NO:2), b)
GFGGFGGLGGR (SEQ ID NO:12) (residues 24 to 34 of
SEQ ID NO:2), c) VCVPR (SEQ ID NO:13) (residues 84 to
88 of SEQ ID NO:2), or d) PNVVPK (SEQ ID NO:14)
(residues 55-60 of SEQ ID NO:2).

To get the full-length cDNA of the PX3.101 gene, including
the coding sequence for the amino terminus and the signal
peptide, the 5'-RACE (Rapid Amplification of cDNA Ends)
system (Gibco, MD) was used. Oligonucleotide primers
ASEQ2 (5' ATCGCGGAACGCA 3'; SEQ ID NO:15) and
ASEQ3 (5' AAGGATCCAAGTCTACATACAC 3'; SEQ
ID NO:16) were synthesized and used to amplify the 5' end
of PX3.101 cDNA. The PCR products were cloned into the
BamHI and SalI sites of pBluescript sk(+) vector and
sequenced. The predicted protein sequences were analyzed
to see if they matched the peptide sequence obtained through
protein sequencing: GFGGFGGLGGR (SEQ ID NO:12)
(residues 24 to 34 of SEQ ID NO:2). The protein and gene
sequences of PX3.101 were analyzed by searching the
database to identify any structural or functional motifs.

B. Results

A DNA fragment containing coding sequence for the peptides
(NEIFSR (SEQ ID NO:10), residues 40 to 45 of SEQ
ID NO:2), (VCVPR (SEQ ID NO:13), residues 84 to 88 of
SEQ ID NO:2), and (PNVVPK (SEQ ID NO:14), residues
55 to 60 of SEQ ID NO:2) was discovered. This DNA
fragment is part of the PX3.101 gene. The full-length PX3.101
gene (472 base pairs; SEQ ID NO:1) was isolated from the
honeybee cDNA library. It contains a 276 base pair coding
sequence (residues 74 to 349 of SEQ ID NO:1), a 5' end
untranslated region, a 3' end untranslated region, and a
poly(A) tail. The predicted PX3.101 protein consists of 92
amino acids (SEQ ID NO:2) including the peptide
(GFGGFGGLGGR (SEQ ID NO:12), residues 24 to 34 of
SEQ ID NO:2). The nucleotide and predicted amino acid
sequence of PX3.101 are shown in FIG. 3A.

Like other secreted molecules, PX3.101 protein consists
of a 19 amino acid signal peptide at the N-terminus (FIG.
3B; residues 1-19 of SEQ ID NO:2). The coding sequence
for the PX3.101 signal peptide can be used to construct an
expression vector to express recombinant proteins in
secreted form.

The secreted and natural PX3.101 protein in honeybee
venom starts with five GGX repeats (FIG. 3B; residues 20
to 34 of SEQ ID NO:2). GGX repeats are present in several
structure proteins, including keratin (CAA28991), abducin
(2739489), fibrillarin (P22232), elastin (207462), spider silk
protein (AAC38847), precollagen D (2772914) and precollagen
P (2388676) of mussel byssus, homeotic protein
Spalt-accessory (AAC38847), putative immediate early protein
of Alcelaphine herpesvirus 1 (2338034), EBNA-1 of
Epstein-Barr virus (P03211), and many Mycobacterium
tuberculosis proteins (CAA17751, CAA15537, CAA17576,
CAA17749). GGX repeats form a condensed helical structure
and may be involved in formation of polymers.
Interestingly, auto-antibodies against keratin, fibrillarin or
elastin are found in rheumatoid arthritis patients.

The C-terminus of PX3.101 is cysteine rich (FIG. 3B;
residues 35-92 of SEQ ID NO:2). In 58 amino acids, there
are 10 cysteines. A cysteine-rich motif such as this is present
in a group of proteins, including tectorin (CAA68138),
zonadhesin (3327421), IgG Fc binding protein
(AAD15624), von Willebrand factor (CAA27765), ECM 18
(1100979), mucin (AF015521), hemocytin (P980920),
SCO-spondin (CAA69868), tumor necrosis factor receptor II
(2739045), a chymotrypsin inhibitor from honeybee
(4699856), anti-coagulant protein C2 (1203803), and several
proteins with similarity to EGF-like domain
(CAA98455, AF016450, 1226303, U70857, 1226304).
Most of the proteins above are extracellular proteins mediating
different signal transduction pathways and have more
than one cysteine-rich motif. Tectorin, the protein associated
with hearing disability, has three such cysteine-rich
domains. IgG Fc binding protein has as many as twelve.

A database search identified several protein candidates as
potential homologues of PX3.101. All of them are small
proteins with a signal peptide and at least one cysteine-rich
[text illegible in source] schematic structures and name or accession
[text illegible in source].

EXAMPLE IV

Expression and Purification of Recombinant PX3.101 Protein

A. Method

To generate the recombinant PX3.101 protein, the full-length
cDNA of PX3.101 was cloned into pFastBacHTb, a
baculovirus expression vector, in-frame with the coding
sequence for His-tag (Gibco, MD) (see FIG. 8). This virus
expression vector was designed for high-level production
and rapid purification of the recombinant protein (Lukow, et
al., Bio/Technology, 1989, 6:47).

pFastBacHTb-PX3.101 was used to transform DH10Bac
cells (Gibco) for transposition into the bacmid. The recombinant
bacmid containing PX3.101 cDNA was isolated and
assessed by PCR. To generate the baculovirus, SF9 cells
[text illegible in source] recombinant [text illegible in source].
Following 5 days of incubation at 30° C., the virus
stock was collected and used to [text illegible in source]
high titer virus stock. The titer of virus stock was determined
by plaque assay.

To optimize the condition to generate PX3.101 protein,
the recombinant baculovirus stock was used to infect High
Five cells (Invitrogen, Calif.), an insect cell line generally
expressing significantly higher levels of recombinant proteins
compared to other insect cells (Wickham, et al.,
Biotechnol. Prog., 1992, 8:391). High Five cells in mid-log
phase of growth in one liter of serum-free medium were
infected with the recombinant baculovirus stock (1:100 v/v).
After incubation at 30° C. for 96 hrs, cells were harvested by
centrifugation and lysed using guanidinium lysis buffer (6 M
guanidine hydrochloride, 20 mM sodium phosphate, 500
mM sodium chloride, pH 7.8).

After centrifugation, the supernatant was collected and
incubated with pre-equilibrated PROBOND resin
(Invitrogen, Calif.) at 4° C. for 3 hours. The column was
washed sequentially with two bed volumes of the following
buffers twice: denaturing binding buffer (8 M urea, 20 mM
sodium phosphate, 500 mM sodium chloride, pH 7.8),
denaturing wash buffer 1 (8 M urea, 20 mM sodium
phosphate, 500 mM sodium chloride, pH 6.0), and denaturing
wash buffer 2 (8 M urea, 20 mM sodium phosphate, 500
mM sodium chloride, pH 5.3). Recombinant PX3.101 was
eluted from the column using denaturing elution buffer (8 M
urea, 20 mM sodium phosphate, 500 mM sodium chloride,
pH 4.0). The His tag was removed by rTEV protease
digestion.

B. Results

Recombinant PX3.101 is expressed and soluble in denaturing
lysis buffer. After removal of the His tag by rTEV protease
digestion, recombinant PX3.101 is almost identical in
size on SDS-polyacrylamide gel electrophoresis. Recombinant
PX3.101 was purified using a PROBOND affinity column
followed by HPLC. About 20 mg of PX3.101 protein was
obtained from 1 liter of cell culture.

After protein refolding, recombinant PX3.101 protein
and natural PX3.101 protein show equivalent activities
in inhibiting the binding of IL-8 to its receptor CXCR2 (FIG.
7B).

EXAMPLE V

Animal Studies — Collagen Induced Arthritis Mouse Studies 

A. General Method 

The Collagen-Induced Arthritis (CIA) animal model is
widely acknowledged as the most appropriate in vivo model
system to test potential therapeutics to treat rheumatoid
arthritis and is recommended by the Food and Drug Administration
for pre-clinical testing in preparation for an IND
(Investigational New Drug) filing.

The following two animal studies began with 8-10 week
old DBA/1J male mice from Jackson Laboratories (Bar
Harbor, Me.). Disease was induced in all animals following
the protocol described by Rosloniec, et al. (Current Protocols
in Immunology, John Wiley & Sons, Inc. 1996). In brief,
chicken collagen type II was dissolved in 10 mM acetic acid
at 4 mg/ml and stirred overnight at 4° C. It is important that
native collagen type II be kept cold while being dissolved to
prevent its denaturation. Using a high-speed homogenizer,
chicken type II collagen was emulsified in an equal volume
of Complete Freund's Adjuvant (CFA) just prior to immunization.
The solution is kept cold throughout the emulsification.

On Day 1, DBA/1JLacJ mice were injected intradermally
at the base of the tail with 0.1 mg of the chicken collagen
Type II emulsified in CFA. On Day 21, a second identical
injection was administered.
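
The injection volume is not stated in the text, but it can be
estimated from the figures given. The sketch below assumes that the
collagen solution and the adjuvant volumes are simply additive, so
that a 4 mg/ml stock mixed with an equal volume of CFA yields a
2 mg/ml emulsion; the function names are hypothetical and the result
is an estimate, not a value reported in the study.

```python
def emulsion_conc_mg_per_ml(stock_mg_per_ml, stock_vol=1.0, adjuvant_vol=1.0):
    """Collagen concentration after mixing stock with adjuvant (additive volumes assumed)."""
    return stock_mg_per_ml * stock_vol / (stock_vol + adjuvant_vol)


def injection_volume_ml(dose_mg, conc_mg_per_ml):
    """Volume of emulsion that delivers the requested collagen dose."""
    return dose_mg / conc_mg_per_ml


if __name__ == "__main__":
    conc = emulsion_conc_mg_per_ml(4.0)        # 4 mg/ml stock, equal volume of CFA
    volume = injection_volume_ml(0.1, conc)    # 0.1 mg collagen per injection
    print(f"emulsion: {conc:.1f} mg/ml, injection: {volume * 1000:.0f} microliters")
```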

As set forth below, various treatments were administered
subcutaneously, at various times, into tissue of the upper
back/shoulder area. Inflammation was recorded throughout
each study, at least twice weekly, by counting the number of
swollen toes, paws and ankles of each animal. Each joint
was assigned a score of either 0 (no inflammation) or 1
(inflammation). According to this scoring system, a maximum
score of 28 (7 measurements per limb x 4 limbs) and a
minimum score of 0 could be assigned to an animal at any
single scoring occasion. This number is representative of
the disease severity.
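
The scoring scheme can be sketched as follows. The data layout (one
list of seven 0/1 values per limb) and the function name are
assumptions made for illustration; the text specifies only that each
site is scored 0 or 1 and that there are 7 measurements on each of
4 limbs.

```python
SITES_PER_LIMB = 7   # measurements per limb, as stated in the text
LIMBS = 4


def severity_score(limb_scores):
    """Sum binary inflammation scores over all sites of one animal.

    limb_scores: a sequence of 4 sequences, each holding 7 values of 0 or 1.
    Returns an integer between 0 and 28.
    """
    if len(limb_scores) != LIMBS:
        raise ValueError("expected scores for 4 limbs")
    total = 0
    for limb in limb_scores:
        if len(limb) != SITES_PER_LIMB:
            raise ValueError("expected 7 measurements per limb")
        if any(score not in (0, 1) for score in limb):
            raise ValueError("each site is scored 0 or 1")
        total += sum(limb)
    return total


if __name__ == "__main__":
    # Hypothetical animal: two swollen sites on one limb, one on another
    animal = [[1, 1, 0, 0, 0, 0, 0], [0] * 7, [0, 0, 0, 0, 0, 0, 1], [0] * 7]
    print(severity_score(animal))
```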

B. Study 1
1. Method

Five groups of 10 mice each were treated as follows:
Group 1: PX3.101 (200 μg/kg) administered subcutaneously
for 15 days starting on Day 16.
Group 2: Bee Venom (1000 μg/kg) (obtained from
Apitronic Services and dissolved in PBS) administered
subcutaneously for 30 days starting on Day 1.
Group 3: Negative Control (Phosphate Buffered in normal
Saline (PBS) administered subcutaneously for 30 days
starting on Day 1).
Group 4: INDOMETHACIN (positive control; available
from Sigma Chemical Co., St. Louis, Mo.) administered
orally for 30 days starting on Day 1 (1000 μg/kg).
Group 5: Normal Control (same as Group #3, except mice
received no collagen).

On Day 52, blood samples were obtained from the Negative
Control (Group #3) and PX3.101 treated animals
(Group #1) to evaluate the immunogenicity of PX3.101 in
this animal model. Serum was prepared and tested by
Western Blot analysis as described in Example I.
2. Results

Daily treatment of mice with PX3.101 at the dose of 200
μg/kg from Day 16 to Day 30 (Group #1) suppressed
inflammation in CIA (collagen-induced arthritis) mice. Its
activity was comparable to INDOMETHACIN, a known
anti-arthritic drug (daily treatment at a dose of 1 mg/kg,
Group #4) (see FIG. 4A). Statistical significance when
compared to the Negative Control Group #3 is p<0.05.

Histopathologic studies of joints from the mice treated
with PX3.101 and PBS further demonstrated the therapeutic
activity of PX3.101 in suppressing inflammation (see FIGS.
5A-5C). In FIGS. 5A-5C, the white space located near the
center of the photographs is a space between the bones (large
dark regions) in a joint of the mouse that contains synovial
fluid. The small dark spots or granules, particularly noticeable
in FIG. 5B, are the nuclei of leukocytes (e.g.,
neutrophils, T-cells, macrophages, and other cells stimulated
as part of an immune response) that have infiltrated the joint.
These leukocytes actively degrade bone.

FIG. 5A shows a normal joint wherein collagen has not
been injected to induce arthritis (Group 5). The dark bony
material is smooth and undegraded and there are very few
leukocytes present. In sharp contrast, many leukocytes were
present in the joint from mice which were injected with
collagen and then treated with PBS (Group 3, see FIG. 5B).
In this negative control treatment group, bone erosion and
penetration by leukocytes was observed and cartilage damage
was obvious (see FIG. 5B). FIG. 5C is a photograph of
a joint from a mouse from Group 1 that was treated with
PX3.101. Very little bone erosion and cartilage damage was
observed. There are also very few leukocytes present in the
joint. This result suggests that PX3.101 can inhibit migration
of leukocytes to inflammatory sites. The hypothesis is supported
by our findings that PX3.101 inhibits the interaction
between chemokine IL-8 and its receptors CXCR1 and
CXCR2 (see FIG. 7A and Example VI). IL-8 is the major
chemokine involved in inflammation. Its function includes
recruiting neutrophils to the inflammatory site and activating
them to release superoxide, proteases, and bioactive lipids.

Western Blot analysis did not detect antibodies against
PX3.101 in the serum of mice treated with PX3.101 at 200
μg/kg for 15 consecutive days.

C. Study 2 
1. Method 

Four groups of 7-8 mice each were treated as follows:

Group 1: PX3.101 (200 μg/kg) administered subcutaneously
for 15 days starting on Day 22.

Group 2: PX3.101 (40 μg/kg) administered subcutaneously
for 15 days starting on Day 22.

Group 3: PX3.101 (8 μg/kg) administered subcutaneously
for 15 days starting on Day 22.

Group 4: Negative Control (Phosphate Buffered in normal
Saline (PBS) administered subcutaneously for 15 days
starting on Day 22).
2. Results 

The effectiveness of various concentrations of PX3.101 in
suppressing inflammation in CIA (collagen-induced
arthritis) was demonstrated. In this study, mice received
PX3.101 treatment from Day 22 to Day 37, instead of Day
16 to Day 30 as in Study 1 above. Among three dosages (8
μg/kg, 40 μg/kg, and 200 μg/kg) tested, 40 μg/kg appears to
be the optimal concentration (FIG. 6).

Results from this study indicate that the activities of the
PX3.101 molecule in this animal model depend on its
dosage in a non-linear manner. Such phenomena have been
observed in many cases in pre-clinical or clinical
investigations, for example INF-α soluble receptor and
relaxin, a drug candidate in late stage clinical development
for scleroderma.

In addition, a substantial sustained therapeutic effect was 
observed after the treatment with PX3.101 was 
discontinued, suggesting possible long-lasting effect of this 
molecule. 

D. Summary of Animal Studies 

Therapeutic potential was demonstrated for a purified
component from honeybee venom identified as PX3.101. In
evaluating the data from the two studies, it appears that the
in vivo activity of PX3.101 is dependent on its dosage and
the time to start treatment. The most effective dosage of
PX3.101 of the dosages tested was 40 μg/kg.

EXAMPLE VI 

Mechanism of Action Studies 

A. Method: Chemokine or Cytokine/Receptor Binding

Experiments to examine the effects of PX3.101 on
chemokine or cytokine/receptor binding were carried out by
Panlabs. For the inhibition of IL-8/CXCR2 binding, purified
(naturally-occurring or recombinant) PX3.101 was added to
0.2 ml reaction solution that contained 0.15 mg/ml of a
membrane preparation of human recombinant CHO cells
expressing CXCR2, 0.015 nM 125I-labeled IL-8, and 10 nM
unlabeled IL-8 to give a final PX3.101 concentration of 0,
0.01, 0.1, 1.0 and 10 μM. Reaction mixtures were incubated
for 60 minutes at room temperature. Bound radioligand was
then separated from unbound radioligand and the radioactivity
measured on a gamma counter. Similar experiments
were carried out to examine the effect of PX3.101 on the
IL-8/CXCR1 interaction, where the membrane preparation
of human recombinant CHO cells that expressed CXCR1
was used and a single dose of PX3.101 (10 μM) was tested.
TNF-α/TNF-α receptor binding experiments were carried
out in a similar manner. Briefly, 10 μM PX3.101 was added
to the reaction solution that contained 50 mM Tris-HCl (pH
7.4), 0.5 mM EDTA, 0.028 nM 125I-labeled TNF-α, 40 nM
unlabeled TNF-α, and a preparation of human U937 cells that
expressed TNF-α receptor. The mixture was allowed to
incubate for 3 hours at 4° C. Bound radioligands were
separated from unbound and the radioactivity was counted
on the gamma counter.

B. Results

PX3.101 was found to specifically inhibit the interaction
between IL-8 and CXCR1 and the interaction between IL-8
and CXCR2 (FIG. 7A). PX3.101 inhibited the IL-8 and CXCR2
interaction in a dose-dependent manner (FIG. 7A). Preliminary
tests show an IC50 of 0.5 μM. However, the binding of
TNF-α to its receptor was not affected by PX3.101 (FIG.
7A).
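
To illustrate what an IC50 of 0.5 μM would imply at the
concentrations tested, the sketch below evaluates a simple one-site
inhibition curve. The model (Hill slope of 1, complete inhibition at
saturating concentrations) is an assumption chosen for illustration
and is not stated in the text; the actual dose-response data are
those of FIG. 7A.

```python
def percent_inhibition(conc_um, ic50_um=0.5, hill=1.0):
    """Percent inhibition of binding under a simple one-site model (assumed)."""
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    if conc_um == 0:
        return 0.0
    ratio = (conc_um / ic50_um) ** hill
    return 100.0 * ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Final PX3.101 concentrations used in the IL-8/CXCR2 binding assay
    for conc in (0, 0.01, 0.1, 1.0, 10):
        print(f"{conc:>5} uM -> {percent_inhibition(conc):5.1f} % inhibition (model)")
```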

IL-8 is a major chemokine that regulates the inflammatory
process. There is also research suggesting it may be
involved in tumor angiogenesis and tumor metastasis (Koch,
et al., (1992) Science 258:1798). Since PX3.101 inhibits the
binding of IL-8 to its receptors CXCR1 and CXCR2,
PX3.101 is expected to be effective in the treatment of
cancer, inflammatory diseases, autoimmune diseases and
other diseases involving IL-8. Inhibition of the IL-8/CXCR2
interaction by purified naturally-occurring PX3.101 and
recombinant PX3.101 is shown in FIG. 7B.

PX3.101 was also found to inhibit several enzymes
involved in the pathogenesis of rheumatoid arthritis, including
cyclooxygenases (COX1 and COX2), phospholipase
A2, phospholipase C, lipoxygenase, and the proteases
trypsin and cathepsin G (data not shown). Several of these
enzymes are either integrated in the phospholipid membrane
(cyclooxygenases) or use fatty acids or phospholipids as
their substrates (phospholipase A2, phospholipase C,
lipoxygenase). Interestingly, PX3.101 inhibited COX1 when
the enzyme purified in lipid vesicles was used but showed no
inhibition of the free enzyme in solution. This result suggests
that the inhibition of the lipid/fatty acid related enzymes
might occur through non-specific interaction between
PX3.101 and lipids/fatty acids.

It is understood that the examples and embodiments 
described herein are for illustrative purposes only and that 
various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included 
within the spirit and purview of this application and scope of 
the appended claims. All publications, patents, and patent 
applications cited herein are hereby incorporated by refer- 
ence in their entirety for all purposes. 



TABLES 



TABLE I

Selected fragments resulting from in-gel trypsin digestion

Fragment   SEQ ID NO:   Amino Acid Sequence      Peptide MW (D) by Mass Spec
#47        17           V-X-V-P-R                Not Determined
#62        18           X-P-S-N-E-I-F-S-R        1124.2
#75        12           G-F-G-G-F-G-G-L-G-G-R    981.9
#88        19           X-X-P-N-V-V-P-K          Not Determined

*V = Valine, P = Proline, C = Cysteine, K = Lysine, S = Serine, I =
Isoleucine, E = Glutamic acid, N = Asparagine, X = Unsure



TABLE II

N-Terminal Sequences for PX3.101 protein fractions from honey
bee venom (FIG. 2C)

                      N-terminal Sequence of    Measured MW   Predicted MW
          SEQ ID      PX3.101 proteins in       (D) by        (D)* based
Name      NO:         honey bee venom           mass spec     on sequence

Puri-#1   20          G-G-F-G-G-L-G-G-R-G       7178          7178
Puri-#2   21          G-F-G-G-F-G-G-L-G-G       7405          7382
Puri-#3   22          F-G-G-F-G-G-F-G-G-L       7586          7586

*The molecular weights were predicted using IntelliGenetics program
assuming that all the 10 cysteines form 5 pairs of disulfide bonds. It is
also assumed that the C-terminus of the protein is intact and that there are
no post-translational modifications of the molecule.
G = Glycine, F = Phenylalanine, R = Arginine, L = Leucine
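
The predicted values in Table II can be approximated from SEQ ID
NO:2 using the assumptions stated in the footnote (intact
C-terminus, five disulfide bonds, no other modifications). The
sketch below is not the IntelliGenetics program; it sums standard
average residue masses, adds one water, and subtracts two hydrogens
per disulfide. The start positions 26, 24 and 22 are inferred here
by aligning the N-terminal sequences of Puri-#1, Puri-#2 and Puri-#3
to SEQ ID NO:2, and small differences from the table are expected
because the mass table used by the original program is unknown.

```python
# Average residue masses in daltons (standard values)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015
HYDROGEN = 1.008

# SEQ ID NO:2 (92 residues), written in rows of 16 as in the sequence listing
SEQ2 = (
    "MSRLVLASFLLLAIFS"
    "MLVGGFGGFGGFGGLG"
    "GRGKCPSNEIFSRCDG"
    "RCQRFCPNVVPKPLCI"
    "KICAPGCVCRLGYLRN"
    "KKKVCVPRSKCG"
)


def predicted_mw(first_residue, last_residue=92, disulfides=5):
    """Average mass of residues first..last (1-based, inclusive) with disulfides formed."""
    peptide = SEQ2[first_residue - 1:last_residue]
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return mass - 2 * HYDROGEN * disulfides


if __name__ == "__main__":
    for name, start in (("mature (after signal peptide)", 20),
                        ("Puri-#3", 22), ("Puri-#2", 24), ("Puri-#1", 26)):
        print(f"{name}: residues {start}-92, about {predicted_mw(start):.0f} Da")
```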



SEQUENCE LISTING



<160> NUMBER OF SEQ ID NOS: 22

<210> SEQ ID NO 1
<211> LENGTH: 472
<212> TYPE: DNA
<213> ORGANISM: Apis mellifera
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (74)..(352)

<223> OTHER INFORMATION: honey bee venom PX3.101 protein 

<400> SEQUENCE: 1 

attcacagtg caacgtaagt tcttttcttc tttttttttt cgaaaanaca actttgtttg    60

agaagaacaa aac atg tct cgt ctg gtt ctt gcc tcc ttc ctt ctt ttg        109
              Met Ser Arg Leu Val Leu Ala Ser Phe Leu Leu Leu
              1               5                   10

gca att ttc tcc atg ctt gtt gga gga ttt gga gga ttt gga gga ttt      157
Ala Ile Phe Ser Met Leu Val Gly Gly Phe Gly Gly Phe Gly Gly Phe
        15                  20                  25

gga gga ctt gga gga cgt ggt aaa tgt cca agc aat gag atc ttc agt      205
Gly Gly Leu Gly Gly Arg Gly Lys Cys Pro Ser Asn Glu Ile Phe Ser
    30                  35                  40

aga tgc gat gga cgg tgc caa cgt ttt tgc ccc aat gtt gtt act aaa      253
Arg Cys Asp Gly Arg Cys Gln Arg Phe Cys Pro Asn Val Val Pro Lys
45                  50                  55                  60

cct tta tgc atc aag atc tgt gca cca gga tgt gta tgt aga ctt ggt      301
Pro Leu Cys Ile Lys Ile Cys Ala Pro Gly Cys Val Cys Arg Leu Gly
                65                  70                  75

tat tta agg aat aaa aag aag gta tgc gtt ccg cga tct aaa tgc gga      349
Tyr Leu Arg Asn Lys Lys Lys Val Cys Val Pro Arg Ser Lys Cys Gly
            80                  85                  90

tgacttttat aattatttca tgattatttt atgattgttt aacaattatt gtattgtatt   409

ttatcattca taaaaattgt tatgttatta ttttatcagt aaaaaaaaaa aaaaaaaaaa   469

aaa                                                                  472



<210> SEQ ID NO 2 
<211> LENGTH: 92 
<212> TYPE: PRT 

<213> ORGANISM: Apis mellifera 
<400> SEQUENCE: 2 

Met Ser Arg Leu Val Leu Ala Ser Phe Leu Leu Leu Ala Ile Phe Ser
1               5                   10                  15

Met Leu Val Gly Gly Phe Gly Gly Phe Gly Gly Phe Gly Gly Leu Gly
            20                  25                  30

Gly Arg Gly Lys Cys Pro Ser Asn Glu Ile Phe Ser Arg Cys Asp Gly
        35                  40                  45

Arg Cys Gln Arg Phe Cys Pro Asn Val Val Pro Lys Pro Leu Cys Ile
    50                  55                  60

Lys Ile Cys Ala Pro Gly Cys Val Cys Arg Leu Gly Tyr Leu Arg Asn
65                  70                  75                  80

Lys Lys Lys Val Cys Val Pro Arg Ser Lys Cys Gly
                85                  90
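
The residue ranges cited in Example III can be checked directly
against SEQ ID NO:2. The sketch below is an illustrative consistency
check written for this listing; the one-letter transcription of the
sequence and the helper name are assumptions.

```python
import re

# One-letter transcription of SEQ ID NO:2, in rows of 16 as in the listing
SEQ2 = (
    "MSRLVLASFLLLAIFS"
    "MLVGGFGGFGGFGGLG"
    "GRGKCPSNEIFSRCDG"
    "RCQRFCPNVVPKPLCI"
    "KICAPGCVCRLGYLRN"
    "KKKVCVPRSKCG"
)


def region(first, last):
    """Return residues first..last of SEQ ID NO:2 (1-based, inclusive)."""
    return SEQ2[first - 1:last]


if __name__ == "__main__":
    print("signal peptide (1-19):  ", region(1, 19))
    print("GGX repeats (20-34):    ", region(20, 34))
    print("SEQ ID NO:12 (24-34):   ", region(24, 34))
    print("SEQ ID NO:11 (38-45):   ", region(38, 45))
    print("SEQ ID NO:14 (55-60):   ", region(55, 60))
    print("SEQ ID NO:13 (84-88):   ", region(84, 88))
    c_term = region(35, 92)
    print("C-terminal region: %d residues, %d cysteines" % (len(c_term), c_term.count("C")))
    # Coding sequence spans nucleotides 74 to 349 of SEQ ID NO:1
    print("coding length:", 349 - 74 + 1, "nt =", (349 - 74 + 1) // 3, "codons")
```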



<210> SEQ ID NO 3 
<211> LENGTH: 26 
<212> TYPE: DNA 

<213> ORGANISM; Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: forward 
primer ASEQ10

<400> SEQUENCE: 3

aaggatccac agtgcaacgt aagttc                                         26

<210> SEQ ID NO 4
<211> LENGTH: 55 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence : forward 
primer ASEQ11

<400> SEQUENCE: 4 

aaggatccgg aggatttgga ggatttggag gatttggagg acttggagga cgtgg 55 



<210> SEQ ID NO 5 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: reverse 
primer ASEQ13 

<400> SEQUENCE: 5 

actgataaaa taataac 17 



<210> SEQ ID NO 6 
<211> LENGTH: 16 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: reverse 
primer ASEQ14 

<400> SEQUENCE: 6 

atgaatgata aaatac 16 



<210> SEQ ID NO 7 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence :reverse 
primer ASEQ15 

<400> SEQUENCE: 7 

ttataaaagt catccgc 17



<210> SEQ ID NO 8
<211> LENGTH: 25
<212> TYPE: DNA

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: degenerate
oligonucleotide primer
<221> NAME/KEY: modified_base
<222> LOCATION: (23)

<223> OTHER INFORMATION: n = g, a, c or t

<400> SEQUENCE: 8

atggatccaa ygarathtty wsnag 25



<210> SEQ ID NO 9 
<211> LENGTH: 6 

<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: amino acid 
sequence obtained from protein sequencing 



<400> SEQUENCE: 9 



US 6,395,306 B1

Asn Glu Ile Phe Ser Arg
1 5 



<210> SEQ ID NO 10 
<211> LENGTH: 28 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence :Oligo(dT) 



<400> SEQUENCE: 10 

ttgcggccgc tttttttttt tttttttt 



<210> SEQ ID NO 11 
<211> LENGTH: 8 
<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: residues 
38-45 of SEQ ID NO: 2 obtained through protein
sequencing; peptide fragment #75 from in-gel
trypsin digestion

<400> SEQUENCE: 11

Pro Ser Asn Glu Ile Phe Ser Arg
1               5



<210> SEQ ID NO 12
<211> LENGTH: 11
<212> TYPE: PRT

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: residues 
24-34 of SEQ ID NO: 2 obtained through protein sequencing 

<400> SEQUENCE: 12 

Gly Phe Gly Gly Phe Gly Gly Leu Gly Gly Arg 
1 5 10 



<210> SEQ ID NO 13 

<211> LENGTH: 5 

<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: residues 
84-88 of SEQ ID NO: 2 obtained through protein sequencing

<400> SEQUENCE: 13

Val Cys Val Pro Arg 
1 5 



<210> SEQ ID NO 14 
<211> LENGTH: 6 
<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: residues 
55-60 of SEQ ID NO: 2 obtained through protein sequencing

<400> SEQUENCE: 14 

Pro Asn Val Val Pro Lys
1               5



<210> SEQ ID NO 15
<211> LENGTH: 13
<212> TYPE: DNA

<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: 5' RACE
oligonucleotide primer ASEQ2

<400> SEQUENCE: 15

atcgcggaac gca 13

<210> SEQ ID NO 16
<211> LENGTH: 22
<212> TYPE: DNA

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: 5' RACE
oligonucleotide primer ASEQ3

<400> SEQUENCE: 16 

aaggatccaa gtctacatac ac 22 



<210> SEQ ID NO 17 
<211> LENGTH: 5 
<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: peptide 

fragment #47 from in-gel trypsin digestion
<221> NAME/KEY: MOD_RES
<222> LOCATION: (2)

<223> OTHER INFORMATION: Xaa = unsure amino acid

<400> SEQUENCE: 17 

Val Xaa Val Pro Arg 
1 5 



<210> SEQ ID NO 18 
<211> LENGTH: 9 
<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: peptide 

fragment #62 from in-gel trypsin digestion
<221> NAME/KEY: MOD_RES
<222> LOCATION: (1)

<223> OTHER INFORMATION: Xaa = unsure amino acid
<400> SEQUENCE: 18

Xaa Pro Ser Asn Glu Ile Phe Ser Arg
1 5 



<210> SEQ ID NO 19 
<211> LENGTH: 8 
<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: peptide 

fragment #88 from in-gel trypsin digestion
<221> NAME/KEY: MOD_RES
<222> LOCATION: (1)..(2)

<223> OTHER INFORMATION: Xaa = unsure amino acid

<400> SEQUENCE: 19

Xaa Xaa Pro Asn Val Val Pro Lys
1               5



<210> SEQ ID NO 20
<211> LENGTH: 10
<212> TYPE: PRT

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: N-terminal
sequence from Puri-#1

<400> SEQUENCE: 20

Gly Gly Phe Gly Gly Leu Gly Gly Arg Gly
1               5                   10



<210> SEQ ID NO 21

<211> LENGTH: 10

<212> TYPE: PRT

<213> ORGANISM: Artificial Sequence

<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: N-terminal
sequence from Puri-#2

<400> SEQUENCE: 21

Gly Phe Gly Gly Phe Gly Gly Leu Gly Gly
1               5                   10



<210> SEQ ID NO 22 

<211> LENGTH: 10 

<212> TYPE: PRT 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence :N-terminal 
sequence from Puri-#3 

<400> SEQUENCE: 22 

Phe Gly Gly Phe Gly Gly Phe Gly Gly Leu 
1 5 10 



What is claimed is: 

1. An isolated nucleic acid molecule that comprises a
polynucleotide sequence that encodes a polypeptide having
an amino acid sequence at least 75% identical to an amino
acid sequence as set forth in SEQ ID NO:2 over a region at
least about 40 amino acids in length when compared using
the BLASTP algorithm with a wordlength (W) of 3, and the
BLOSUM62 scoring matrix, wherein the polypeptide is
effective to reduce the symptoms of an inflammatory disease.
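
For orientation only, the sketch below shows how a percent identity
figure over an aligned region is computed and compared with the 75%
and 40-residue thresholds recited above. It is not the BLASTP
algorithm: it assumes two sequences that have already been aligned
to equal length (with "-" for gaps), and the function names are
hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_thresholds(aligned_a, aligned_b, min_identity=75.0, min_length=40):
    """True if the aligned region is long enough and identical enough."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


if __name__ == "__main__":
    # Toy example: 40 aligned residues with 8 mismatches gives 80% identity
    reference = "GKCPSNEIFSRCDGRCQRFCPNVVPKPLCIKICAPGCVCR"
    variant = "GKCPSNEIFSRCDGRCQRFCPNVVPKPLCIKIAAAAAAAA"
    print(percent_identity(reference, variant), meets_thresholds(reference, variant))
```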

2. The nucleic acid of claim 1, wherein the polynucleotide 
sequence encodes a polypeptide having an amino acid 
sequence as shown in SEQ ID NO: 2. 

3. An isolated nucleic acid molecule that comprises a
polynucleotide sequence at least 75% identical to a nucleic
acid sequence set forth in nucleotides 74 to 349 of SEQ ID
NO:1 over a region of at least 50 nucleotides in length when
compared using the BLASTN algorithm with a wordlength
(W) of 11, M=5, and N=-4 and encodes a polypeptide that
is effective in reducing the symptoms of an inflammatory
disease.

4. The nucleic acid of claim 3, wherein the inflammatory 
disease is rheumatoid arthritis. 

5. The nucleic acid of claim 1, wherein the polynucleotide
sequence hybridizes to a nucleic acid having a sequence as
set forth in residues 74 to 349 of SEQ ID NO:1 under
stringent conditions, wherein the stringent conditions are
conditions in which the ionic strength is equivalent to a
solution containing 0.01 to 0.1 M sodium ion, the pH is pH
7.0 to 8.3 and the temperature is at least 30° C. for
polynucleotides 10 to 50 nucleotides in length and at least
60° C. for polynucleotides greater than 50 nucleotides in
length.

6. The nucleic acid of claim 1, wherein the polynucleotide
sequence is as set forth in residues 74 to 349 of SEQ ID
NO:1.

7. The nucleic acid of claim 1, wherein the polynucleotide
sequence is one that can be amplified using the forward
primer 5' AAGGATCCACAGTGCAACGTAAGTTC 3'
(SEQ ID NO:3) and reverse primer 5' ACTGATAAAATAATAAC 3'
(SEQ ID NO:5).

8. The nucleic acid of claim 1, wherein the polynucleotide
sequence is as set forth in SEQ ID NO:1.

9. The nucleic acid of claim 1, wherein the polynucleotide
sequence is derived from a sample from bee venom.

10. The nucleic acid of claim 1, further comprising a 
promoter sequence operably linked to the polynucleotide 
sequence. 

11. A vector comprising a nucleic acid of claim 1. 

12. The vector of claim 11, wherein said vector is a
baculovirus.

13. A cell containing the vector of claim 11. 

14. A cell comprising a recombinant expression cassette
comprising a promoter operably linked to a polynucleotide
sequence which is at least about 75% identical to residues 74
to 349 of the polynucleotide sequence as set forth in SEQ ID
NO:1 over a region at least about 50 nucleotides in length
when compared using the BLASTN algorithm with a
wordlength (W) of 11, M=5, and N=-4 and which encodes
a polypeptide that is effective in reducing the symptoms of
an inflammatory disease.

15. The cell of claim 14, wherein the insect cell line is a 
High Five insect cell line. 

16. The nucleic acid of claim 14, wherein the inflammatory 
disease is rheumatoid arthritis. 

17. The cell of claim 13, wherein the polynucleotide 
hybridizes to a nucleic acid having a sequence as set forth in 
residues 74 to 349 of SEQ ID NO:1 under stringent 
conditions, wherein the stringent conditions are conditions 
in which the ionic strength is equivalent to a solution 
containing 0.01 to 0.1 M sodium ion, the pH is pH 7.0 to 8.3 
and the temperature is at least 30° C. for polynucleotides 10 
to 50 nucleotides in length and at least 60° C. for 
polynucleotides greater than 50 nucleotides in length. 

18. The cell of claim 13, wherein the polynucleotide 
sequence is as set forth in residues 74 to 349 of SEQ ID 
NO:1. 

19. The cell of claim 13, wherein the cell is an insect cell 
line. 

20. A method for producing a polypeptide, the method 
comprising the steps of: 

(a) culturing a host cell containing the nucleic acid of 
claim 1 under conditions suitable for the expression of 
the polypeptide; and 

(b) recovering the polypeptide from the host cell culture. 

21. The nucleic acid of claim 1, wherein the inflammatory 
disease is rheumatoid arthritis. 

22. An isolated nucleic acid molecule comprising a 
nucleotide sequence selected from the group consisting of: 

(a) a deoxyribonucleotide sequence complementary to 
nucleotides 74 to 349 of SEQ ID NO:1; 

(b) a ribonucleotide sequence complementary to 
nucleotides 74 to 349 of SEQ ID NO:1; 

(c) a nucleotide sequence complementary to the 
deoxyribonucleotide sequence of (a) or to the ribonucleotide 
sequence of (b); 

(d) a nucleotide sequence of at least 23 consecutive 
nucleotides that hybridizes to nucleotides 74 to 349 of 
SEQ ID NO:1; and 

(e) a nucleotide sequence that hybridizes to a nucleotide 
sequence of (d). 

23. A vector comprising the nucleic acid of claim 21. 

24. A cell containing the vector of claim 23. 
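
As a rough illustration of the stringent hybridization conditions recited in claims 5 and 17, the following Python sketch checks a set of conditions against those ranges. The class name, field names and function name are hypothetical, and treating "ionic strength equivalent to" a sodium ion concentration as a plain molar value is a simplifying assumption that is not part of the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    """Hypothetical container for hybridization or wash conditions."""
    sodium_molar: float    # sodium ion equivalent in mol/L (assumed plain molarity)
    ph: float
    temperature_c: float   # degrees Celsius


def is_stringent(cond: HybridizationConditions, length_nt: int) -> bool:
    """Return True if cond lies within the ranges recited in claims 5 and 17."""
    if length_nt < 10:
        # The claims recite no temperature for polynucleotides under 10 nt.
        raise ValueError("no threshold is recited below 10 nucleotides")
    # At least 30 C for 10 to 50 nt, at least 60 C for more than 50 nt.
    min_temp_c = 30.0 if length_nt <= 50 else 60.0
    return (
        0.01 <= cond.sodium_molar <= 0.1
        and 7.0 <= cond.ph <= 8.3
        and cond.temperature_c >= min_temp_c
    )
```

Under these assumptions, 0.05 M sodium at pH 7.5 and 35° C. would qualify for a 25-nucleotide probe but not for a 120-nucleotide polynucleotide, which needs at least 60° C.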

* * * * * 
ABSTRACT 

Topoisomerase III polypeptides and DNA and RNA encoding 
such Topoisomerase III polypeptides, and a procedure for 
producing such polypeptides by recombinant techniques, are 
disclosed. Also disclosed are methods for utilizing such 
Topoisomerase III for the treatment of infection, particularly 
bacterial infections. Antagonists against such Topoisomerase 
III and their use as a therapeutic to treat infections, 
particularly bacterial infections, are also disclosed. Also 
disclosed are diagnostic assays for detecting diseases related 
to the presence of Topoisomerase III nucleic acid sequences 
and the polypeptides in a host. Also disclosed are diagnostic 
assays for detecting polynucleotides encoding Staphylococcal 
Topoisomerase III and for detecting the polypeptide in a 
host. 

25 Claims, 3 Drawing Sheets 
FIGURE 1 



TTCATTGTACTGTTGAGGAAGTTTATATAAGTATGATGCTGATTCATAATTCGAATGTTC 

AATGAACGATTTTATTTGTGTAATATCATATAACTGAACTATGCTCATGTCATTACCTCC 

GTACTTTTTGTTACTTTTATTATATAGTATTTCAACTGAAATGA 

TAACATGTTACAATACATTTAACACCATTGAATTTAAATCAAAGATTAGT(^ 

TUVGCACGTTCGAJUITAAAAGAACGTATGAGAAAG^^ 

GCTGAAAAACCATCAGTTGCAAGAGATATTGCrGATGCTTTA 

AATGGTTACTTTGAAAATAACCAATATATTGTCACGTG^ 

AATGCGACACCTGAACAATACGATAAAAATTTAAAGGAATGGCGATTAGAAGACCTTCC^ 

ATTATACCTAAATATATGAAAACTGTTGTTATTGGTAAAACAAGCAAACA^ 

GTAAAAGCGTTAATTTTAGATAATAAAGTGAAAGATATTATTATTGCAACAGAT^ 

CGAGAAGGTGAACTAGTTGCAAGATTGATTTTGGATAAAGTTGGTAAGAAAAAGCCAATC 

CGTCGATTATGGATTAGCTCAGTTACTAAAAAAGCTATTCAACAAGGT^ 

AAAGACGGTCgT<;^TATAAg(?ATTT(?TATTAT^ 

TGGATTGTTGGGATTAATGCAACGCGTGCACTAAC^ 

CTGGGACGTGTTCAGACACCAACGATTCAATTAGTAAATACACGACAACj^GAGATTAAT 

CAGTTCAAACCACAACAATACTTTACATTATQATTAACGGTAAAAGGGTTTGATT^ 

CTAGAATCAAATCAGCGATATACCAATAAAGAAACITTAGAACAGATGGTTAATAATT^ 

AAAAATGTCGATGGTAAGATTAAATCTGTTGCTACTAAACATAAGAAGTCGTATCCGCAA 

TCACTGTACAATTTAACAGATT TACAACAAQATATQTATAGA^^ 

AAAGAAACATTGAATACACITC AAAGCCT 

AGAACAGATTCAAACTATTTAA CAACTGATATGGTAGATACTAT^^ 

GCGACGATGGCAACAACATATAAAGACCAAGCACGCCCATTAATGTCTAAAAC^ 

TCAAAAATGTCGATATTTAATAATCAAAAAGTATCTGATC 
FIGURE 1A 



ACCATGCAATTATTCCTACAQAAGTGAGACCTGTCATGTCAGACTTAAGTAATAGAGAAT 

TAAAGTTATACGATATGATTGTCGAGCGTTTTTTAGAAGCTTTAATGCCTCCGCAC^ 

ATGACGCGATAACTGTAACTTTAGAGGTTGCAGGGCACACATTTGTTTTGAAAGAGAATG 

TAACAACTGTTTTAGGTTTTAAATCTATTAGACAAGGTGAATCTATTACAQAGATGCAAC 

AGCCTTTTTCAGAAGGCGATGAAGTGAAGATTTCAAAAAC;^ 

CAACACCTCCAGAATATTTTA ATGAAGGTTCGTTATTA^ 

ACTTTATTCAATTGAAGGATAAAAAATATGCGCAAACTTTAA^ 

GCACAGTTGCAACAAGGGCCGACATTATCGATAAATTATT^ 

CAAGAGACGGTAAAATTAAAGTAACGTCAAAAGGTAAACAAATATTAGAATTAGC^^ 

aagaattaacgtcgccacttttaactgcacaatgggaagaaaaattactttt;^ 

gtggtaaatatct^ggcgaaaacatttattaatgaaatgaaagattttacg 

taaatgggattaaaaatagtgatcgtaaatataaacacgataatttaacaaccacagaat 

gccau^cgtgtggtaaattcatgattaaagttaaaactaaaaatggtc^ 

gccaagatccatcttgtaagacgaaaaagaatgtacagcgcaaaacaaatgcaagatgtc 

ctvaactqtaaaaaqajattaacgttgtttggtaaagggaaag^ 

tttgtggacattctgaaacgcaagcacatatggatcagcgtatgaagtctaaatc^ 

gtaaagtatctcgtaaagaaatgaaaaagtatatgaataaaaatgaaggtttagacaata 

atccgtttaaagatgcattaaagaacttgaatttataga taaaatcgaacaaagttgaat 

CAGAAAAACGAAAAGTTCGCTTTTGGTATTGTTTTTTATTAAGAATGATAT^ 

AAGGTATTTTAAAAAAAGGAGCATCCATTOJTGAAAAACT^ 

GGAACAAACTTTAAAAGAGAAATCTTAGGCGGTATCACAACTTT(^ 

ATTTTAGCAGTTAACCCGCyUVGTTTTAAGT^ 

ATGAAAATGGACCAAGGTGCCATTTTTGTAGCGACTGCAT^ 

CTATTCATGGGACTAATAGCTAAATATCCAATCGCATTAGCACCAG^ 

TTC 
FIGURE 2 



MKSL I LAEKPS YARD I ADALQ INQKRNGYFENNQY I VTWALGHLVTNATPEQYDKNLKEW 

RLEDLP 1 1 PKYMKTWI GKTSKQFKT\nCAL ILDNKVKD 1 1 1 ATDAGREGELVARL I LDKV 

GNKKPIRRLWISSWKKAIQQGPKNLKDGRQYNDLYYAALARSEADWIVGINATRALTTK 

YDAQLSLGRVQTPTIQLVNTRQQEINQFKPQQYPTLSLTVKGFDFQLESNQRYTNKETLE 

Q^mJNLKNVDGKIKSVATKHKKSYPQSLYNLTDLQQDMYRRYKIGPKETLNTLQSLYERH 

KVVTYPRTDSNYLTTDMVDTMKER I QATMATTYKDQARPLMS KTFSSK^ 

HAI I PTEWPVMSDLSNRELKLYDM I VERFLEAJ^PPHEYDAITVTLEVAGHTFVLKENV 

TTVLGFKSIRQGESITEMQQPFSEGDEVKISKTNIREHETTPPEYFNEGSLLKAMENPQN 

FIQLKDKKYAQTLKQTGGIGTVATRADIIDKLFNMNAIESRDGKIKVTSKGKQILEI^^ 

ELTSPLLTAQWEEKLLL I ERGKYQAKTF INEMKDFTKDVWGI KNSDRKYKHDNLTTTEC 

PTCGKFM I KVKTKNGQMLVCQDPSCKTKKNVQRKTNARCPNCKKKLTLP^ 

CGHSETQAHMI^RMKSKSSGKVSRKEMKKYMNKNEGLDNNPFKDALKNI^ 
TOPOISOMERASE III 

RELATED APPLICATIONS 

This Application claims benefit of U.S. Provisional 
Application Ser. No. 60/028,417, filed Oct. 15, 1996. 

FIELD OF THE INVENTION 

This invention relates, in part, to newly identified 
polynucleotides and polypeptides; variants and derivatives of 
these polynucleotides and polypeptides; processes for making 
these polynucleotides and these polypeptides, and their 
variants and derivatives; agonists and antagonists of the 
polypeptides; and uses of these polynucleotides, 
polypeptides, variants, derivatives, agonists and antagonists. 
In particular, in these and in other regards, the invention 
relates to polynucleotides and polypeptides of bacterial 
"Topoisomerase III". 

BACKGROUND OF THE INVENTION 

Among the more effective antibiotics are those that 
interfere with common modes of bacterial gene expression, 
regulation or activity. Recently, the supercoiling of DNA has 
been suggested as a possible mode of virulence gene 
regulation. Local increases or decreases in DNA density, due to 
supercoiling, have been associated with responses to various 
environmental conditions such as temperature, 
anaerobiosis, and osmolarity. Appropriate regulation of the 
accessibility of groups of genes to components of the 
transcriptional apparatus by increasing or decreasing 
supercoiling of spatially organized genes may represent an 
infecting pathogen's effective response to such environmental 
conditions. Enzymes, such as DNA topoisomerases including 
type 1 topoisomerases and DNA gyrases, have been 
identified which function to affect the levels of DNA 
supercoiling. Such enzymes represent useful targets against which 
to screen compounds as potential antibiotics. 

DNA transformations performed by DNA topoisomerases 
are accomplished by the cleavage of either a single strand or 
both strands. The unit change in the linking number (Lk) 
resulting from such transformations is the best operational 
distinction between the two classes of topoisomerases (P. O. 
Brown & N. R. Cozzarelli, Science 206:1081-1083 (1979)). 
The linking number (Lk) is the algebraic number of times 
one strand crosses the surface stretched over the other 
strand. DNA topoisomerases whose reactions proceed via a 
transient single-stranded break and change the Lk in steps 
of one are classified as type 1, while enzymes whose 
reactions proceed via double-stranded breaks and change 
the Lk in steps of two are classified as type 2. 

Members of the type 2 topoisomerase family include DNA 
gyrase, bacterial DNA topoisomerase IV, T-even phage DNA 
topoisomerases, eukaryotic DNA topoisomerase II, and 
thermophilic topoisomerase II from Sulfolobus acidocaldarius 
(see: A. Kikuchi et al., Syst. Appl. Microbiol. 7: 72-78 
(1986); J. Kato et al., J. Biol. Chem. 267: 25676-25684 
(1992); W. M. Huang in DNA Topology and Its Biological 
Effects (N. R. Cozzarelli and J. C. Wang, eds., Cold Spring 
Harbor Laboratory Press, New York, 1990), pp. 265-284; 
T-S Hsieh in DNA Topology and Its Biological Effects (N. 
R. Cozzarelli and J. C. Wang, eds., Cold Spring Harbor 
Laboratory Press, New York, (1990), pp. 243-263)). The 
coding sequences of a dozen or so type 2 enzymes have been 
determined, and the data suggest that all these enzymes are 
evolutionarily and structurally related. Topological reactions 
catalyzed by type 2 topoisomerases include introduction of 
negative supercoils into DNA (DNA gyrase), relaxation of 
supercoiled DNA, catenation (or decatenation) of duplex 
circles, knotting and unknotting of DNA. 

The family of type 1 topoisomerases comprises bacterial 
topoisomerase I, E. coli topoisomerase III, S. cerevisiae 
topoisomerase III (R. A. Kim & J. C. Wang, J. Biol. Chem. 
267: 17178-17185 (1992)), human topoisomerase III (Hanai 
et al., Proc. Natl. Acad. Sci. 93:3653-3657 (1996)), the type 
1 topoisomerase from chloroplasts that closely resembles 
bacterial enzymes (J. Siedlecki et al., Nucleic Acids Res. 11: 
1523-1536 (1983)), thermophilic reverse gyrases (A. 
Kikuchi, in DNA Topology and Its Biological Effects (N. 
Cozzarelli and J. C. Wang, eds., Cold Spring Harbor 
Laboratory Press, New York, 1990), pp. 285-298; C. 
Bouthier de la Tour et al., J. Bact. 173: 3921-3923 (1991)), 
thermophilic D. amylolyticus topoisomerase III (A. I. 
Slesarev et al., J. Biol. Chem. 266: 12321-12328 (1991)), 
nuclear topoisomerases I and closely related enzymes from 
mitochondria and poxviruses (N. Osheroff, Pharmac. Ther. 
41: 223-241 (1989)). With respect to the mechanism of 
catalysis these topoisomerases can be divided into two 
groups. Group A consists of enzymes that require a divalent 
cation for activity, and form a transient covalent complex 
with the 5'-phosphoryl termini (prokaryotic type 1 
topoisomerases, S. cerevisiae topoisomerase III, and human 
topoisomerase III). Group B includes type 1 topoisomerases 
that do not require a divalent cation for activity, and bind 
covalently to the 3'-phosphoryl termini (nuclear 
topoisomerases I, enzymes from mitochondria and poxviruses 
commonly called eukaryotic topoisomerases I). Type 1 
topoisomerases can carry out the following topological 
reactions: they relax supercoiled DNA (except for reverse 
gyrases), catenate (or decatenate) single-stranded circular 
DNAs or duplexes providing that at least one of the 
molecules contains a nick or gap, or interact with 
single-stranded circles to introduce topological knots (type 1-group 
A topoisomerases). Reverse gyrase, belonging to type 
1-group A topoisomerases, is the only topoisomerase shown 
to be able to introduce positive supercoils into cDNA. 

Research on DNA topoisomerases has progressed from 
DNA enzymology to developmental therapeutics. Bacterial 
DNA topoisomerase II is an important therapeutic target of 
quinolone antibiotics; mammalian DNA topoisomerase II is 
the cellular target of many potent antitumor drugs (K. Drlica, 
Microbiol. Rev. 48: 273-289 (1984) and Biochemistry 27: 
2253-2259 (1988); B. S. Glisson & W. E. Ross, Pharmacol. 
Ther. 32: 89-106 (1987); A. L. Bodley & L. F. Liu, 
Biotechnology 6: 1315-1319 (1988); L. F. Liu, Annu. Rev. 
Biochem. 58: 351-375 (1989)). These drugs, referred to as 
topoisomerase II poisons, interfere with the 
breakage-rejoining reaction of type II topoisomerase by trapping a key 
covalent reaction intermediate, termed the cleavable 
complex. Mammalian topoisomerase I is the cellular target of the 
antitumor drug topotecan (U.S. Pat. No. 5,004,758), which 
also traps the covalent reaction intermediate. 

As mentioned above, bacterial type I topoisomerases 
(topoisomerase I & III) are enzymes that alter DNA 
topology and are involved in a number of crucial cellular 
processes including replication, transcription and 
recombination (Luttinger, A., Molecular Microbiol. 15(4): 601-608 
(1995)). These enzymes act by transiently breaking one 
strand of DNA, passing a single or double strand of DNA 
through the break and finally resealing the break. Cleavage 
of the DNA substrate forms a covalent linkage between a 
tyrosine residue of the enzyme and the 5' end of the DNA 
chain at the cleavage site (Roca, J. A., TIBS 20:156-160 
(1995)). 
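
The operational distinction drawn in the Background, that type 1 enzymes change the linking number (Lk) in steps of one and type 2 enzymes in steps of two, can be sketched in a few lines of Python. The function names below are hypothetical and the model is deliberately minimal: it treats Lk as an integer and ignores substrate requirements such as nicks, gaps or cofactors.

```python
def classify_by_unit_change(delta_lk: int) -> str:
    """Classify a single catalytic event by its unit change in Lk."""
    step = abs(delta_lk)
    if step == 1:
        return "type 1"   # transient single-stranded break
    if step == 2:
        return "type 2"   # transient double-stranded break
    raise ValueError("a single event changes Lk by one or by two")


def reachable(lk_start: int, lk_target: int, enzyme_type: str) -> bool:
    """Return True if repeated events of one enzyme type can reach lk_target."""
    if enzyme_type == "type 1":
        return True       # steps of one reach every integer Lk
    if enzyme_type == "type 2":
        # Steps of two preserve the parity of Lk.
        return (lk_target - lk_start) % 2 == 0
    raise ValueError("enzyme_type must be 'type 1' or 'type 2'")
```

The parity check is the practical consequence of the step size: in this simplified model, a type 2 enzyme acting alone can only move a closed circle between linking numbers that differ by an even amount.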
Enzyme inhibition which leads to the stabilization of the 
covalent enzyme-DNA complex (cleavable complex) will 
invoke chromosomal damage and bacterial cell death. 
Furthermore, this mechanism has the potential of leading to 
cell death by virtue of a single inhibition event. A small 
molecular weight inhibitor, which acts by stabilization of the 
cleavable complex, may act on both topoisomerase I and III 
because of the extensive amino acid sequence similarity 
between them, particularly in the region of their active sites. 
The likelihood of future high level resistance to such agents 
arising from point mutation may therefore be low. 

Inhibitors of type I topoisomerases, for example, those 
able to stabilize the protein in a covalent complex with DNA, 
would be lethal or inhibitory to the bacterium and thereby 
have utility in anti-bacterial therapy. It is particularly 
preferred to employ Staphylococcal genes and gene products as 
targets for the development of antibiotics. The 
Staphylococci make up a medically important genus of microbes. 
They are known to produce two types of disease, invasive 
and toxigenic. Invasive infections are characterized 
generally by abscess formation affecting both skin surfaces and 
deep tissues. S. aureus is the second leading cause of 
bacteremia in cancer patients. Osteomyelitis, septic arthritis, 
septic thrombophlebitis and acute bacterial endocarditis are 
also relatively common. There are at least three clinical 
conditions resulting from the toxigenic properties of 
Staphylococci. The manifestation of these diseases results from 
the actions of exotoxins as opposed to tissue invasion and 
bacteremia. These conditions include: Staphylococcal food 
poisoning, scalded skin syndrome and toxic shock 
syndrome. 

SUMMARY OF THE INVENTION 

Toward these ends, and others, it is an object of the 
present invention to provide polypeptides, inter alia, that 
have been identified as novel Topoisomerase III by 
homology between the amino acid sequence set out in FIG. 2 (SEQ 
ID NO: 2) and known amino acid sequences of other 
proteins such as Haemophilus influenzae topoisomerase III. 

It is a further object of the invention, moreover, to provide 
polynucleotides that encode Topoisomerase III, particularly 
polynucleotides that encode the polypeptide herein 
designated bacterial Topoisomerase III. 

In a particularly preferred embodiment of this aspect of 
the invention the polynucleotide comprises the region 
encoding Topoisomerase III in the sequence set out in FIG. 
1 (SEQ ID NO: 1). 

In another particularly preferred embodiment of the 
present invention there is a novel Topoisomerase III protein 
from Staphylococcus aureus comprising the amino acid 
sequence of (SEQ ID NO: 2), or a fragment, analogue or 
derivative thereof. 

In accordance with this aspect of the present invention 
there is provided an isolated nucleic acid molecule encoding 
a mature polypeptide expressible by the Staphylococcus 
aureus polynucleotide contained in deposited strain NCIMB 
40771. 

In accordance with this aspect of the invention there are 
provided isolated nucleic acid molecules encoding 
Topoisomerase III, particularly Staphylococcal Topoisomerase III, 
including mRNAs, cDNAs, genomic DNAs and, in further 
embodiments of this aspect of the invention, biologically, 
diagnostically, clinically or therapeutically useful variants, 
analogs or derivatives thereof, or fragments thereof, 
including fragments of the variants, analogs and derivatives. 

Among the particularly preferred embodiments of this 
aspect of the invention are naturally occurring allelic 
variants of Topoisomerase III. 

In accordance with this aspect of the invention there are 
provided novel polypeptides of Staphylococcal origin 
referred to herein as Topoisomerase III as well as 
biologically, diagnostically or therapeutically useful 
fragments thereof, as well as variants, derivatives and analogs of 
the foregoing and fragments thereof. 

[...] polypeptides, particularly bacterial 
Topoisomerase III polypeptides, that may be employed for 
therapeutic purposes, for example, to treat disease, including 
treatment by conferring host immunity against bacterial 
infections, such as Staphylococcal infections. 

In accordance with yet a further aspect of the present 
invention, there is provided the use of a polypeptide of the 
invention, in particular a fragment thereof, for therapeutic or 
prophylactic purposes, for example, as an antibacterial agent 
or a vaccine. 

In accordance with another aspect of the present 
invention, there is provided the use of a polynucleotide of 
the invention for therapeutic or prophylactic purposes, in 
particular immunization. 

Among the particularly preferred embodiments of this 
aspect of the invention are variants of Topoisomerase III 
polypeptide encoded by naturally occurring alleles of the 
Topoisomerase III gene. 

It is another object of the invention to provide a process 
for producing the aforementioned polypeptides, polypeptide 
fragments, variants and derivatives, fragments of the 
variants and derivatives, and analogs of the foregoing. 

In a preferred embodiment of this aspect of the invention 
there are provided methods for producing the 
aforementioned Topoisomerase III polypeptides comprising culturing 
host cells having expressibly incorporated therein an 
exogenously-derived Topoisomerase III-encoding 
polynucleotide under conditions for expression of 
Topoisomerase III in the host and then recovering the expressed 
polypeptide. 

In accordance with another object of the invention there are 
provided products, compositions, processes and methods 
that utilize the aforementioned polypeptides and 
polynucleotides, inter alia, for research, biological, clinical 
and therapeutic purposes. 

In accordance with yet another aspect of the present 
invention, there are provided inhibitors of such 
polypeptides, useful as antibacterial agents. In particular, 
there are provided antibodies against such polypeptides. 

In accordance with certain preferred embodiments of this 
and other aspects of the invention there are probes that 
hybridize to bacterial Topoisomerase III sequences useful 
for detection of bacterial infection. 

In certain additional preferred embodiments of this aspect 
of the invention there are provided antibodies against 
Topoisomerase III polypeptides. In certain particularly preferred 
embodiments in this regard, the antibodies are selective for 
Staphylococcal Topoisomerase III. 

In accordance with another aspect of the present 
invention, there are provided Topoisomerase III agonists. 
Among preferred agonists are molecules that mimic 
Topoisomerase III, that bind to Topoisomerase III-binding 
molecules, and that elicit or augment Topoisomerase 
III-induced responses. Also among preferred agonists are 
molecules that interact with Topoisomerase III encoding genes 
or Topoisomerase III polypeptides, or with other modulators 
of Topoisomerase III activities, and thereby potentiate or 
augment an effect of Topoisomerase III or more than one 
effect of Topoisomerase III and which are also preferably 
bacteriostatic or bactericidal. 

In accordance with yet another aspect of the present 
invention, there are provided Topoisomerase III antagonists. 
Among preferred antagonists are those which bind to 
Topoisomerase III so as to inhibit the binding of Topoisomerase 
III-binding molecules or to stabilize the complex formed 
between Topoisomerase III and a Topoisomerase III-binding 
molecule to prevent further biological activity arising from 
the Topoisomerase III. Also among preferred antagonists are 
molecules that bind to or interact with Topoisomerase III so 
as to inhibit an effect of Topoisomerase III or more than one 
effect of Topoisomerase III or which prevent expression of 
Topoisomerase III and which are also preferably bacterio- 
static or bactericidal. 

In a further aspect of the invention there are provided 
compositions comprising a Topoisomerase III polynucle- 
otide or a Topoisomerase III polypeptide for administration 
to cells in vitro, to cells ex vivo and to cells in vivo, or to a 
multicellular organism. In certain preferred embodiments of 
this aspect of the invention, the compositions comprise a 
Topoisomerase III polynucleotide for expression of a Topoi- 
somerase III polypeptide in a host organism to raise an 
immunological response, preferably to raise immunity in 
such host against Staphylococci or related organisms. 

Other objects, features, advantages and aspects of the 
present invention will become apparent to those of skill from 
the following description. It should be understood, however, 
that the following description and the specific examples, 
while indicating preferred embodiments of the invention, are 
given by way of illustration only. Various changes and 
modifications within the spirit and scope of the disclosed 
invention will become readily apparent to those skilled in 
the art from reading the following description and from 
reading the other parts of the present disclosure. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings depict certain embodiments of 
the invention. They are illustrative only and do not limit the 
invention otherwise disclosed herein. 

FIG. 1 shows a polynucleotide sequence of Staphylococcus 
aureus Topoisomerase III, comprising a sequence of the 
Staphylococcus aureus Topoisomerase III gene and 
surrounding area (coding sequence underlined) (SEQ ID NO: 1). 

FIG. 2 shows the amino acid sequence of Staphylococcus 
aureus Topoisomerase III (SEQ ID NO: 2) deduced from the 
polynucleotide coding sequence of FIG. 1 (SEQ ID NO: 1). 

GLOSSARY 

The following illustrative explanations are provided to 
facilitate understanding of certain terms used frequently 
herein, particularly in the Examples. The explanations are 
provided as a convenience and are not limitative of the 
invention. 

TOPOISOMERASE III-BINDING MOLECULE, as used 
herein, refers to molecules or ions which bind or interact 
specifically with Topoisomerase III polypeptides or 
polynucleotides of the present invention, including, for example, 
enzyme substrates, such as supercoiled DNA, cell mem- 
brane components and classical receptors. Binding between 
polypeptides of the invention and such molecules, including 
binding or interaction molecules may be exclusive to 
polypeptides of the invention, which is preferred, or it may 
be highly specific for polypeptides of the invention, which 
is also preferred, or it may be highly specific to a group of 
proteins that includes polypeptides of the invention, which 
is preferred, or it may be specific to several groups of 
proteins at least one of which includes a polypeptide of the 
invention. Binding molecules also include antibodies and 
antibody-derived reagents that bind specifically to 
polypeptides of the invention. 

GENETIC ELEMENT generally means a polynucleotide 
comprising a region that encodes a polypeptide or a poly- 
nucleotide region that regulates replication, transcription or 
translation or other processes important to expression of the 
polypeptide in a host cell, or a polynucleotide comprising 
both a region that encodes a polypeptide and a region 
operably linked thereto that regulates expression. Genetic 
elements may be comprised within a vector that replicates as 
an episomal element; that is, as a molecule physically 
independent of the host cell genome. They may be com- 
prised within plasmids. Genetic elements also may be com- 
prised within a host cell genome; not in their natural state 
but, rather, following manipulation such as isolation, cloning 
and introduction into a host cell in the form of purified DNA 
or in a vector, among others. 

HOST CELL is a cell which has been transformed or 
transfected, or is capable of transformation or transfection 
by an exogenous polynucleotide sequence. 

IDENTITY or SIMILARITY, as known in the art, are 
relationships between two polypeptide sequences or two 
polynucleotide sequences, as determined by comparing the 
sequences. In the art, identity also means the degree of 
sequence relatedness between two polypeptide or two 
polynucleotide sequences as determined by the match between 
two strings of such sequences. Both identity and similarity 
can be readily calculated (Computational Molecular 
Biology, Lesk, A. M., ed., Oxford University Press, New 
York, 1988; Biocomputing: Informatics and Genome 
Projects, Smith, D. W., ed., Academic Press, New York, 
1993; Computer Analysis of Sequence Data, Part I, Griffin, 
A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 
1994; Sequence Analysis in Molecular Biology, von Heinje, 
G., Academic Press, 1987; and Sequence Analysis Primer, 
Gribskov, M. and Devereux, J., eds., M Stockton Press, New 
York, 1991). While there exist a number of methods to 
measure identity and similarity between two polynucleotide 
or two polypeptide sequences, both terms are well known to 
skilled artisans (Sequence Analysis in Molecular Biology, 
von Heinje, G., Academic Press, 1987; Sequence Analysis 
Primer, Gribskov, M. and Devereux, J., eds., M Stockton 
Press, New York, 1991; and Carillo, H., and Lipman, D., 
SIAM J. Applied Math., 48: 1073 (1988)). Methods 
commonly employed to determine identity or similarity between 
two sequences include, but are not limited to, those disclosed 
in Carillo, H., and Lipman, D., SIAM J. Applied Math., 
48:1073 (1988). Preferred methods to determine identity are 
designed to give the largest match between the two 
sequences tested. Methods to determine identity and 
similarity are codified in computer programs. Preferred 
computer program methods to determine identity and similarity 
between two sequences include, but are not limited to, GCG 
program package (Devereux, J., et al., Nucleic Acids 
Research 12(1): 387 (1984)), BLASTP, BLASTN, and 
FASTA (Altschul, S. F. et al., J. Mol. Biol. 215: 403 (1990)). 
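
As a minimal sketch of what "the match between two strings of such sequences" can mean in practice, the following Python function computes percent identity for two sequences that have already been aligned. It is not the method of the GCG package, BLASTP, BLASTN or FASTA; the function name, the "-" gap symbol and the choice to count gap-containing columns as mismatches are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are skipped; a column with
    a gap in only one sequence counts as a mismatch (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue              # uninformative column
        compared += 1
        if x == y:
            matches += 1          # identical residues (neither is a gap here)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Different programs make different choices about gaps and about which region is compared, so a figure produced this way is not directly comparable to a value reported by any of the named packages.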

ISOLATED means altered "by the hand of man" from its 
natural state; i.e., that, if it occurs in nature, it has been 
changed or removed from its original environment, or both. 
For example, a naturally occurring polynucleotide or a 
polypeptide naturally present in a living organism in its 
natural state is not "isolated," but the same polynucleotide or 



6,025,156 



8 



polypeptide separated from the coexisting materials of its 
natural state is "isolated", as the term is employed herein. 
For example, with respect to polynucleotides, the term 
isolated means that it is separated from the chromosome and 
cell in which it naturally occurs. As part of or following 
isolation, such polynucleotides can be joined to other 
polynucleotides, such as DNAs, for mutagenesis, to form 
fusion proteins, and for propagation or expression in a host, 
for instance. The isolated polynucleotides, alone or joined to 
other polynucleotides such as vectors, can be introduced into 
host cells, in culture or in whole organisms. Introduced into 
host cells in culture or in whole organisms, such DNAs still 
would be isolated, as the term is used herein, because they 
would not be in their naturally occurring form or 
environment. Similarly, the polynucleotides and polypeptides may 
occur in a composition, such as media formulations, 
solutions for introduction of polynucleotides or 
polypeptides, for example, into cells, compositions or 
solutions for chemical or enzymatic reactions, for instance, 
which are not naturally occurring compositions, and, therein 
remain isolated polynucleotides or polypeptides within the 
meaning of that term as it is employed herein. 

POLYNUCLEOTIDE(S) generally refers to any 
polyribonucleotide or polydeoxyribonucleotide, which may be 
unmodified RNA or DNA or modified RNA or DNA. Thus, 
for instance, polynucleotides as used herein refers to, among 
others, single- and double-stranded DNA, DNA that is a 
mixture of single- and double-stranded regions or single-, 
double- and triple-stranded regions, single- and 
double-stranded RNA, and RNA that is a mixture of single- and 
double-stranded regions, hybrid molecules comprising DNA 
and RNA that may be single-stranded or, more typically, 
double-stranded, or triple-stranded, or a mixture of single- 
and double-stranded regions. In addition, polynucleotide as 
used herein refers to triple-stranded regions comprising 
RNA or DNA or both RNA and DNA. The strands in such 
regions may be from the same molecule or from different 
molecules. The regions may include all of one or more of the 
molecules, but more typically involve only a region of some 
of the molecules. One of the molecules of a triple-helical 
region often is an oligonucleotide. As used herein, the term 
polynucleotide includes DNAs or RNAs as described above 
that contain one or more modified bases. Thus, DNAs or 
RNAs with backbones modified for stability or for other 
reasons are "polynucleotides" as that term is intended 
herein. Moreover, DNAs or RNAs comprising unusual 
bases, such as inosine, or modified bases, such as tritylated 
bases, to name just two examples, are polynucleotides as the 
term is used herein. It will be appreciated that a great variety 
of modifications have been made to DNA and RNA that 
serve many useful purposes known to those of skill in the art. 
The term polynucleotide as it is employed herein embraces 
such chemically, enzymatically or metabolically modified 
forms of polynucleotides, as well as the chemical forms of 
DNA and RNA characteristic of viruses and cells, including 
inter alia, simple and complex cells. The term 
polynucleotide(s) embraces short polynucleotides often 
referred to as oligonucleotides. 

POLYPEFri'DES, as used herein, includes all polypep- 
tides as described below. The basic structure of polypeptides 60 
is well known and has been described in innumerable 
textbooks and other publications in the art. In this context, 
the term is used herein to refer to any peptide or protein 
comprising two or more amino acids joined to each other in 
a linear chain by peptide bonds. As used herein, the term
refers to both short chains, which also commonly are 
referred to in the art as peptides, oligopeptides and 
oligomers, for example, and to longer chains, which gener- 
ally are referred to in the art as proteins, of which there are 
many types. It will be appreciated that polypeptides often 
contain amino acids other than the 20 amino acids com- 
monly referred to as the 20 naturally occurring amino acids, 
and that many amino acids, including the terminal amino 
acids, may be modified in a given polypeptide, either by 
natural processes, such as processing and other post-translational
modifications, or by chemical modification
techniques which are well known in the art. Even the
common modifications that occur naturally in polypeptides 
are too numerous to list exhaustively here, but they are well
described in basic texts and in more detailed monographs, as 
well as in a voluminous research literature, and they are well 
known to those of skill in the art. Among the known 
modifications which may be present in polypeptides of the
present invention are, to name an illustrative few, acetylation,
acylation, ADP-ribosylation, amidation, covalent attach- 
ment of flavin, covalent attachment of a heme moiety, 
covalent attachment of a nucleotide or nucleotide derivative, 
covalent attachment of a lipid or lipid derivative, covalent 
attachment of phosphatidylinositol, cross-linking,
cyclization, disulfide bond formation, demethylation,
formation of covalent cross-links, formation of cystine,
formation of pyroglutamate, formylation, gamma-carboxylation,
glycosylation, GPI anchor formation, hydroxylation,
iodination, methylation, myristoylation, oxidation,
proteolytic processing, phosphorylation, prenylation,
racemization, selenoylation, sulfation, transfer-RNA medi- 
ated addition of amino acids to proteins such as arginylation, 
and ubiquitination. Such modifications are well known to 
those of skill and have been described in great detail in the 
scientific literature. Several particularly common 
modifications, glycosylation, lipid attachment, sulfation, 
gamma-carboxylation of glutamic acid residues, hydroxyla- 
tion and ADP-ribosylation, for instance, are described in 
most basic texts, such as, for instance PROTEINS — 
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., 
T. E. Creighton, W. H. Freeman and Company, New York
(1993). Many detailed reviews are available on this subject,
such as, for example, those provided by Wold, F.,
Posttranslational Protein Modifications: Perspectives and Prospects,
pgs. 1-12 in POSTTRANSLATIONAL COVALENT
MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York (1983); Seifter et al., Meth. Enzymol.
182: 626-646 (1990) and Rattan et al., Protein Synthesis:
Posttranslational Modifications and Aging, Ann. N.Y. Acad.
Sci. 663: 48-62 (1992). It will be appreciated, as is well
known and as noted above, that polypeptides are not always 
entirely linear. For instance, polypeptides may be branched 
as a result of ubiquitination, and they may be circular, with 
or without branching, generally as a result of posttranslational
events, including natural processing events and events
brought about by human manipulation which do not occur 
naturally. Circular, branched and branched circular polypeptides
may be synthesized by non-translational natural processes
and by entirely synthetic methods, as well. Modifications 
can occur anywhere in a polypeptide, including the peptide 
backbone, the amino acid side-chains and the amino or 
carboxyl termini. In fact, blockage of the amino or carboxyl 
group in a polypeptide, or both, by a covalent modification, 
is common in naturally occurring and synthetic polypeptides 
and such modifications may be present in polypeptides of 
the present invention, as well. For instance, the amino 
terminal residue of polypeptides made in E. coli or other 
cells, prior to proteolytic processing, almost invariably will 
be N-formylmethionine. During post-translational modification
of the peptide, a methionine residue at the NH2-terminus
may be deleted. Accordingly, this invention contemplates
the use of both the methionine-containing and the
methionineless amino terminal variants of the protein of the
invention. The modifications that occur in a polypeptide
often will be a function of how it is made. For polypeptides
made by expressing a cloned gene in a host, for instance, the
nature and extent of the modifications in large part will be
determined by the host cell posttranslational modification
capacity and the modification signals present in the
polypeptide amino acid sequence. For instance, as is well known,
glycosylation often does not occur in bacterial hosts such as,
for example, E. coli. Accordingly, when glycosylation is
desired, a polypeptide should be expressed in a glycosylating
host, generally a eukaryotic cell. Insect cells often carry
out the same posttranslational glycosylations as do mammalian
cells and, for this reason, insect cell expression
systems have been developed to express efficiently mammalian
proteins having native patterns of glycosylation.
Similar considerations apply to other modifications. It will
be appreciated that the same type of modification may be
present in the same or varying degree at several sites in a
given polypeptide. Also, a given polypeptide may contain
many types of modifications. In general, as used herein, the
term polypeptide encompasses all such modifications,
particularly those that are present in polypeptides synthesized
by expressing a polynucleotide in a host cell.

VARIANT(S) of polynucleotides or polypeptides, as the
term is used herein, are polynucleotides or polypeptides that
differ from a reference polynucleotide or polypeptide,
respectively. Variants in this sense are described below and
elsewhere in the present disclosure in greater detail. With
reference to polynucleotides, generally, differences are limited
such that the nucleotide sequences of the reference and
the variant are closely similar overall and, in many regions,
identical. As noted below, changes in the nucleotide
sequence of the variant may be silent. That is, they may not
alter the amino acids encoded by the polynucleotide. Where
alterations are limited to silent changes of this type, a variant
will encode a polypeptide with the same amino acid
sequence as the reference. Also as noted below, changes in
the nucleotide sequence of the variant may alter the amino
acid sequence of a polypeptide encoded by the reference
polynucleotide. Such nucleotide changes may result in
amino acid substitutions, additions, deletions, fusions and
truncations in the polypeptide encoded by the reference
sequence, as discussed below. With reference to polypeptides
generally, differences are limited so that the sequences
of the reference and the variant are closely similar overall
and, in many regions, identical. A variant and reference
polypeptide may differ in amino acid sequence by one or
more substitutions, additions, deletions, fusions and
truncations, which may be present in any combination.

DESCRIPTION OF THE INVENTION

The present invention relates to novel Topoisomerase III
polypeptides and polynucleotides encoding same, among
other things, as described in greater detail below. In
particular, the invention relates to polypeptides and
polynucleotides of a novel Topoisomerase III gene of
Staphylococcus aureus, which is related by amino acid sequence
homology to, for example, Haemophilus influenzae
topoisomerase III protein. The closest relatives to Staphylococcus
topoisomerase III are the Haemophilus influenzae
topoisomerase III, which is 33% identical and 52% similar at the
amino acid level, and 57% identical at the nucleotide level,
and E. coli topoisomerase III, which is 30% identical and
53% similar at the amino acid level, and 66% identical at the
nucleotide level. These homology determinations were
made using the Genetics Computer Group BESTFIT program.

The invention relates especially to Staphylococcal
Topoisomerase III having the nucleotide and amino acid
sequences set out in FIG. 1 (SEQ ID NO: 1) and FIG. 2
(SEQ ID NO: 2), and to the Topoisomerase III nucleotide
and amino acid sequences of the DNA isolatable from
Deposit No. NCIMB 40771, which is herein referred to as
"the deposited organism" or as the "DNA of the deposited
organism." It will be appreciated that the nucleotide and
amino acid sequences set out in FIG. 1 (SEQ ID NO: 1) and
FIG. 2 (SEQ ID NO: 2) were obtained by sequencing the
DNA of the deposited organism. Hence, the sequence of the
deposited clone is controlling as to any discrepancies
between it (and the sequence it encodes) and the sequences
of FIG. 1 (SEQ ID NO: 1) and FIG. 2 (SEQ ID NO: 2).

Polynucleotides

In accordance with one aspect of the present invention,
there are provided isolated polynucleotides which encode
the Staphylococcal Topoisomerase III polypeptide having
the deduced amino acid sequence of FIG. 2 (SEQ ID NO: 2).

Using the information provided herein, such as the
polynucleotide sequence set out in FIG. 1 (SEQ ID NO: 1), a
polynucleotide of the present invention encoding
Topoisomerase III polypeptide may be obtained using standard
cloning and screening procedures. To obtain the
polynucleotide encoding the protein using the DNA sequence given in
SEQ ID NO: 1, typically a library of clones of chromosomal
DNA of S. aureus WCUH 29 in E. coli or some other
suitable host is probed with a radiolabelled oligonucleotide,
preferably a 17 mer or longer, derived from the sequence of
FIG. 1. Clones carrying DNA identical to that of the probe
can then be distinguished using high stringency washes. By
sequencing the individual clones thus identified with
sequencing primers designed from the original sequence it is
then possible to extend the sequence in both directions to
determine the full gene sequence. Conveniently such
sequencing is performed using denatured double stranded
DNA prepared from a plasmid clone. Suitable techniques are
described by Maniatis, T., Fritsch, E. F. and Sambrook, J. in
MOLECULAR CLONING, A Laboratory Manual (2nd edition
1989 Cold Spring Harbor Laboratory, see Screening By
Hybridization 1.90 and Sequencing Denatured
Double-Stranded DNA Templates 13.70). Illustrative of the
invention, the polynucleotide set out in FIG. 1 (SEQ ID NO:
1) was discovered in a DNA library derived from
Staphylococcus aureus NCIMB 40771 as described in Example 1.

Topoisomerase III of the invention is structurally related
to other proteins of the bacterial Topoisomerase III family,
as shown by comparing the sequence encoding
Topoisomerase III from the deposited clone with that of sequence
reported in the literature. A preferred DNA sequence is set
out in FIG. 1 (SEQ ID NO: 1). It contains an open reading
frame encoding a protein of about 711 amino acid residues.
The protein exhibits greatest homology to Haemophilus
influenzae topoisomerase III protein among known proteins.

Polynucleotides of the present invention may be in the
form of RNA, such as mRNA, or in the form of DNA,
including, for instance, cDNA and genomic DNA obtained
by cloning or produced by chemical synthetic techniques or
by a combination thereof. The DNA may be double-stranded
or single-stranded. Single-stranded DNA may be the coding
strand, also known as the sense strand, or it may be the
non-coding strand, also referred to as the anti-sense strand.
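
The relationship between the coding (sense) strand, the non-coding (anti-sense) strand and an mRNA of the same sense can be illustrated with a short sketch. This is only an illustration under simple assumptions: the function names are hypothetical, the sequences are written 5' to 3', and only the four unmodified DNA bases are handled, which is narrower than the definition of polynucleotide given above.

```python
# Watson-Crick pairing for the four unmodified DNA bases.
# Modified or unusual bases (e.g. inosine) are deliberately not handled here.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antisense_strand(sense: str) -> str:
    """Return the anti-sense strand, written 5'->3', of a sense-strand DNA."""
    return "".join(_COMPLEMENT[base] for base in reversed(sense.upper()))


def transcribe(sense: str) -> str:
    """Return the mRNA having the same sequence as the sense strand (T -> U)."""
    return sense.upper().replace("T", "U")


if __name__ == "__main__":
    coding = "ATGGCC"  # hypothetical six-base example, not taken from SEQ ID NO: 1
    print(antisense_strand(coding))
    print(transcribe(coding))
```

Taking the anti-sense strand of the anti-sense strand returns the original sense strand, which is a convenient sanity check for any such helper.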
The coding sequence which encodes the polypeptide may
be identical to the coding sequence of the polynucleotide
shown in FIG. 1 (SEQ ID NO: 1). It also may be a
polynucleotide with a different sequence, which, as a result
of the redundancy (degeneracy) of the genetic code, encodes
the polypeptide of FIG. 2 (SEQ ID NO: 2).

Polynucleotides of the present invention which encode the
polypeptide of FIG. 2 (SEQ ID NO: 2) may include, but are
not limited to the coding sequence for the mature
polypeptide, by itself; the coding sequence for the mature
polypeptide and additional coding sequences, such as those
encoding a leader or secretory sequence, such as a pre-, or
pro- or prepro- protein sequence; the coding sequence of the
mature polypeptide, with or without the aforementioned
additional coding sequences, together with additional,
non-coding sequences, including for example, but not limited to
non-coding 5' and 3' sequences, such as the transcribed,
non-translated sequences that play a role in transcription
(including termination signals, for example), ribosome
binding, mRNA stability elements, and additional coding
sequence which encode additional amino acids, such as
those which provide additional functionalities. The DNA
may also comprise a promoter region which functions to
direct the transcription of the mRNA encoding the
Topoisomerase III of this invention. Such promoter may be
independently useful to direct the transcription of a
heterologous gene in a recombinant expression system. Furthermore,
the polypeptide may be fused to a marker sequence, such as
a peptide, which facilitates purification of the fused
polypeptide. In certain embodiments of this aspect of the invention,
the marker sequence is a hexa-histidine peptide, such as the
tag provided in the pQE vector (Qiagen, Inc.), among others,
many of which are commercially available. As described in
Gentz et al., Proc. Natl. Acad. Sci., USA 86: 821-824
(1989), for instance, hexa-histidine provides for convenient
purification of the fusion protein. The HA tag may also be
used to create fusion proteins and corresponds to an epitope
derived from influenza hemagglutinin protein, which has been
described by Wilson et al., Cell 37: 767 (1984), for instance.

In accordance with the foregoing, the term "polynucleotide
encoding a polypeptide" as used herein encompasses
polynucleotides which include a sequence encoding a
polypeptide of the present invention, particularly bacterial,
and more particularly Staphylococcus aureus
Topoisomerase III having the amino acid sequence set out in FIG.
2 (SEQ ID NO: 2). The term encompasses polynucleotides
that include a single continuous region or discontinuous
regions encoding the polypeptide (for example, interrupted
by integrated phage or insertion sequence or editing)
together with additional regions, that also may contain
coding and/or non-coding sequences.

The present invention further relates to variants of the
herein above described polynucleotides which encode for
fragments, analogs and derivatives of the polypeptide having
the deduced amino acid sequence of FIG. 2 (SEQ ID
NO: 2). A variant of the polynucleotide may be a naturally
occurring variant such as a naturally occurring allelic
variant, or it may be a variant that is not known to occur
naturally. Such non-naturally occurring variants of the
polynucleotide may be made by mutagenesis techniques, including
those applied to polynucleotides, cells or organisms.

Among variants in this regard are variants that differ from
the aforementioned polynucleotides by nucleotide
substitutions, deletions or additions. The substitutions may
involve one or more nucleotides. The variants may be
altered in coding or non-coding regions or both. Alterations
in the coding regions may produce conservative or
non-conservative amino acid substitutions, deletions or additions.

Among the particularly preferred embodiments of the
invention in this regard are polynucleotides encoding
polypeptides having the amino acid sequence of
Staphylococcal Topoisomerase III set out in FIG. 2 (SEQ ID NO: 2);
variants, analogs, derivatives and fragments thereof.

Further particularly preferred in this regard are
polynucleotides encoding Topoisomerase III variants, analogs,
derivatives and fragments, and variants, analogs and derivatives of
the fragments, which have the amino acid sequence of
Staphylococcal Topoisomerase III polypeptide of FIG. 2
(SEQ ID NO: 2) in which several, a few, 5 to 10, 1 to 5, 1
to 3, 2, 1 or no amino acid residues are substituted, deleted
or added, in any combination. Especially preferred among
these are silent substitutions, additions and deletions, which
do not alter the properties and activities of Topoisomerase
III. Also especially preferred in this regard are conservative
substitutions. Most highly preferred are polynucleotides
encoding polypeptides having the amino acid sequence of
FIG. 2 (SEQ ID NO: 2), without substitutions.

Further preferred embodiments of the invention are
polynucleotides that are at least 70% identical to a
polynucleotide encoding Topoisomerase III polypeptide having the
amino acid sequence set out in FIG. 2 (SEQ ID NO: 2), and
polynucleotides which are complementary to such
polynucleotides. Alternatively, most highly preferred are
polynucleotides that comprise a region that is at least 80%
identical to a polynucleotide encoding Topoisomerase III
polypeptide of the Staphylococcus aureus DNA of the
deposited clone and polynucleotides complementary
thereto. In this regard, polynucleotides at least 90% identical
to the same are particularly preferred, and among these
particularly preferred polynucleotides, those with at least
95% are especially preferred. Furthermore, those with at
least 97% are highly preferred among those with at least
95%, and among these those with at least 98% and at least
99% are particularly highly preferred, with at least 99%
being the more preferred.

Particularly preferred embodiments in this respect,
moreover, are polynucleotides which encode polypeptides
which retain substantially the same biological function or
activity as the mature polypeptide encoded by the DNA of
FIG. 1 (SEQ ID NO: 1).

The present invention further relates to polynucleotides
that hybridize to the herein above-described sequences. In
this regard, the present invention especially relates to
polynucleotides which hybridize under stringent conditions to
the herein above-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will
occur only if there is at least 95% and preferably at least 97%
identity between the sequences.

As discussed additionally herein regarding polynucleotide
assays of the invention, for instance, polynucleotides of the
invention as discussed above, may be used as a hybridization
probe for RNA, cDNA and genomic DNA to isolate
full-length cDNAs and genomic clones encoding
Topoisomerase III and to isolate cDNA and genomic clones of
other genes that have a high sequence similarity to the
Topoisomerase III gene. Such probes generally will
comprise at least 15 bases. Preferably, such probes will have at
least 30 bases and may have at least 50 bases. Particularly
preferred probes will have at least 30 bases and will have 50
bases or less.

For example, the coding region of the Topoisomerase III
gene may be isolated by screening using the known DNA
sequence to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of
a gene of the present invention is then used to screen a
library of cDNA, genomic DNA or mRNA to determine
which members of the library the probe hybridizes to.

The polynucleotides and polypeptides of the present
invention may be employed as research reagents and materials
for discovery of treatments of and diagnostics for
disease, particularly human disease, as further discussed
herein relating to polynucleotide assays.

The polynucleotides of the invention that are
oligonucleotides, derived from the sequences (SEQ ID NO:
1) may be used as PCR primers in the process herein
described to determine whether or not the Staphylococcus
aureus genes identified herein in whole or in part are
transcribed in infected tissue. It is recognized that such
sequences will also have utility in diagnosis of the stage of
infection and type of infection the pathogen has attained.

The polynucleotides may encode a polypeptide which is
the mature protein plus additional amino or carboxyl-terminal
amino acids, or amino acids interior to the mature
polypeptide (when the mature form has more than one
polypeptide chain, for instance). Such sequences may play a
role in processing of a protein from precursor to a mature
form, may allow protein transport, may lengthen or shorten
protein half-life or may facilitate manipulation of a protein
for assay or production, among other things. As generally is
the case, in vivo, the additional amino acids may be processed
away from the mature protein by cellular enzymes.

A precursor protein, having the mature form of the
polypeptide fused to one or more prosequences may be an
inactive form of the polypeptide. When prosequences are
removed such inactive precursors generally are activated.
Some or all of the prosequences may be removed before
activation. Generally, such precursors are called proproteins.

In sum, a polynucleotide of the present invention may
encode a mature protein, a mature protein plus a leader
sequence (which may be referred to as a preprotein), a
precursor of a mature protein having one or more prosequences
which are not the leader sequences of a preprotein,
or a preproprotein, which is a precursor to a proprotein,
having a leader sequence and one or more prosequences,
which generally are removed during processing steps that
produce active and mature forms of the polypeptide.

Deposited materials

Staphylococcus aureus WCUH 29 was deposited at the
National Collection of Industrial and Marine Bacteria Ltd.
(NCIMB), 23 St. Machar Drive, Aberdeen, Scotland under
number NCIMB 40771 on Sep. 11, 1995.

The deposit has been made under the terms of the Budapest
Treaty on the International Recognition of the Deposit of
Micro-organisms for Purposes of Patent Procedure. The
strain will be irrevocably and without restriction or condition
released to the public upon the issuance of a patent. The
deposit is provided merely as convenience to those of skill
in the art and is not an admission that a deposit is required
for enablement, such as that required under 35 U.S.C. §112.

The sequence of the polynucleotides contained in the
deposited material, as well as the amino acid sequence of the
polypeptide encoded thereby, are controlling in the event of
any conflict with any description of sequences herein.

A license may be required to make, use or sell the
deposited materials, and no such license is hereby granted.

Polypeptides

The present invention further relates to a bacterial
Topoisomerase III polypeptide that has the deduced amino acid
sequence of FIG. 2 (SEQ ID NO: 2).

The invention also relates to fragments, analogs and
derivatives of these polypeptides. The terms "fragment,"
"derivative" and "analog" when referring to the polypeptide
of FIG. 2 (SEQ ID NO: 2), means a polypeptide which
retains essentially the same biological function or activity as
such polypeptide. Fragments, derivatives and analogs that
retain at least 90% of the activity of the native
Topoisomerase III are preferred. Fragments, derivatives and analogs
that retain at least 95% of the activity of the native
Topoisomerase III are preferred. Thus, an analog includes a
proproprotein which can be activated by cleavage of the
proprotein portion to produce an active mature polypeptide.

The polypeptide of the present invention may be a
recombinant polypeptide, a natural polypeptide or a synthetic
polypeptide. In certain preferred embodiments it is a
recombinant polypeptide.

The fragment, derivative or analog of the polypeptide of
FIG. 2 (SEQ ID NO: 2) may be (i) one in which one or more
of the amino acid residues are substituted with a conserved
or non-conserved amino acid residue (preferably a conserved
amino acid residue) and such substituted amino acid
residue may or may not be one encoded by the genetic code,
or (ii) one in which one or more of the amino acid residues
includes a substituent group, or (iii) one in which the mature
polypeptide is fused with another compound, such as a
compound to increase the half-life of the polypeptide (for
example, polyethylene glycol), or (iv) one in which the
additional amino acids are fused to the mature polypeptide,
such as a leader or secretory sequence or a sequence which
is employed for purification of the mature polypeptide or a
proprotein sequence. Such fragments, derivatives and analogs
are deemed to be obtained by those of ordinary skill in
the art, from the teachings herein.

Among the particularly preferred embodiments of the
invention in this regard are polypeptides having the amino
acid sequence of Staphylococcal Topoisomerase III set out
in FIG. 2 (SEQ ID NO: 2), variants, analogs, derivatives and
fragments thereof, and variants, analogs and derivatives of
the fragments.

Among preferred variants are those that vary from a
reference by conservative amino acid substitutions. Such
substitutions are those that substitute a given amino acid in
a polypeptide by another amino acid of like characteristics.
Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl
residues Ser and Thr, exchange of the acidic residues Asp
and Glu, substitution between the amide residues Asn and
Gln, exchange of the basic residues Lys and Arg and
replacements among the aromatic residues Phe and Tyr.

Further particularly preferred in this regard are variants,
analogs, derivatives and fragments, and variants, analogs
and derivatives of the fragments, having the amino acid
sequence of the Topoisomerase III polypeptide of FIG. 2
(SEQ ID NO: 2), in which several, a few, 5 to 10, 1 to 5, 1
to 3, 2, 1 or no amino acid residues are substituted, deleted
or added, in any combination. Especially preferred among
these are silent substitutions, additions and deletions, which
do not alter the properties and activities of the
Topoisomerase III. Also especially preferred in this regard are
conservative substitutions. Most highly preferred are
polypeptides having the amino acid sequence of FIG. 2
(SEQ ID NO: 2) without substitutions.

The polypeptides and polynucleotides of the present
invention are preferably provided in an isolated form, and
preferably are purified to homogeneity.

The polypeptides of the present invention include the
polypeptide of FIG. 2 (SEQ ID NO: 2), in particular the
mature polypeptide as well as polypeptides which have at
least 80% identity to the polypeptide of FIG. 2 (SEQ ID NO:
2) and preferably at least 90% similarity (more preferably at
least 90% identity) to the polypeptide of FIG. 2 (SEQ ID
NO: 2) and more preferably at least 95% similarity; and still 
more preferably at least 95% identity to the polypeptide of 
FIG. 2 (SEQ ID NO: 2) and also include portions of such 
polypeptides with such portion of the polypeptide generally 
containing at least 30 contiguous amino acids and more 
preferably at least 50 contiguous amino acids. 
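
The difference between percent identity and percent similarity used in these thresholds can be sketched as follows. This is a simplified illustration, not the BESTFIT method named elsewhere in this specification: it assumes the two sequences are already aligned, ungapped and of equal length, and it counts a position as "similar" when the residues are identical or fall into the same conservative substitution group listed in this description (Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr). All names in the code are hypothetical.

```python
# Conservative substitution groups as listed in this description,
# written in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(a: str, b: str) -> bool:
    """True if a and b differ but belong to the same conservative group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity_similarity(ref: str, var: str) -> tuple[float, float]:
    """Return (% identity, % similarity) for two pre-aligned, ungapped sequences."""
    if not ref or len(ref) != len(var):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(ref.upper(), var.upper()))
    identical = sum(a == b for a, b in pairs)
    similar = identical + sum(is_conservative(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


if __name__ == "__main__":
    # Hypothetical eight-residue peptides, not taken from SEQ ID NO: 2.
    identity, similarity = percent_identity_similarity("MKVLSDEF", "MRVISDQF")
    print(f"identity={identity:.1f}% similarity={similarity:.1f}%")
```

A real comparison would first align the sequences and score similarity with a substitution matrix; the fixed groups here only mirror the conservative substitutions named in the text.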

Fragments or portions of the polypeptides of the present 
invention may be employed for producing the corresponding 
full-length polypeptide by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing 
the full-length polypeptides. Fragments or portions of the 
polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present inven- 
tion. 

Fragments 

Also among preferred embodiments of this aspect of the 
present invention are polypeptides comprising fragments of 
Topoisomerase III, most particularly fragments of Topoisomerase
III having the amino acid sequence set out in FIG. 2 (SEQ
ID NO: 2), and fragments of variants and derivatives of the
Topoisomerase III of FIG. 2 (SEQ ID NO: 2). 

In this regard a fragment is a polypeptide having an amino 
acid sequence that entirely is the same as part but not all of 
the amino acid sequence of the aforementioned Topoi- 
somerase III polypeptides and variants or derivatives 
thereof. 

Such fragments may be "free-standing," i.e., not part of or 
fused to other amino acids or polypeptides, or they may be 
comprised within a larger polypeptide of which they form a 
part or region. When comprised within a larger polypeptide, 
the presently discussed fragments most preferably form a 
single continuous region. However, several fragments may
be comprised within a single larger polypeptide. For
instance, certain preferred embodiments relate to a fragment
of a Topoisomerase III polypeptide of the present invention comprised
within a precursor polypeptide designed for expression in a 
host and having heterologous pre and pro-polypeptide 
regions fused to the amino terminus of the Topoisomerase III 
fragment and an additional region fused to the carboxyl 
terminus of the fragment. Therefore, fragments, in one aspect
of the meaning intended herein, refer to the portion or
portions of a fusion polypeptide or fusion protein derived 
from Topoisomerase III. 

Representative examples of polypeptide fragments of the
invention include, for example, those
which are from about 5-15, 10-20, 15-40, 30-55, 41-75,
41-80, 41-90, 50-100, 75-100, 90-115, 100-125,
110-140, 120-150, 200-300, 1-175 or 1-711 amino acids
long.

In this context "about" includes the particularly recited
range and ranges larger or smaller by several, a few, 5, 4, 3, 
2 or 1 amino acid at either extreme or at both extremes. 

Among especially preferred fragments of the invention 
are truncation mutants of Topoisomerase III. Truncation 
mutants include Topoisomerase III polypeptides having the 
amino acid sequence of FIG. 2 (SEQ ID NO: 2), or of 
variants or derivatives thereof, except for deletion of a 
continuous series of residues (that is, a continuous region, 
part or portion) that includes the amino terminus, or a 
continuous series of residues that includes the carboxyl 
terminus or, as in double truncation mutants, deletion of two 
continuous series of residues, one including the amino 
terminus and one including the carboxyl terminus. Frag- 
ments having the size ranges set out above also are preferred 
embodiments of truncation fragments, which are especially 
preferred among fragments generally. Degradation forms of
the polypeptides of the invention in a host cell are also 
preferred. 
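
The three kinds of truncation mutant described above (amino-terminal, carboxyl-terminal and double truncation) can be sketched as a single slicing operation on a one-letter amino acid string. This is a minimal illustration; the function name, its parameters and the example peptide are hypothetical and are not taken from FIG. 2 (SEQ ID NO: 2).

```python
def truncation_mutant(sequence: str, n_deleted: int = 0, c_deleted: int = 0) -> str:
    """Delete a continuous run of residues from the amino terminus,
    the carboxyl terminus, or both (a double truncation).

    n_deleted: number of residues removed starting at the amino terminus.
    c_deleted: number of residues removed ending at the carboxyl terminus.
    """
    if n_deleted < 0 or c_deleted < 0:
        raise ValueError("deletion lengths must be non-negative")
    if n_deleted + c_deleted >= len(sequence):
        raise ValueError("deletions would remove the entire sequence")
    return sequence[n_deleted:len(sequence) - c_deleted]


if __name__ == "__main__":
    peptide = "MKVLSDEF"  # hypothetical example peptide
    print(truncation_mutant(peptide, n_deleted=2))               # amino-terminal truncation
    print(truncation_mutant(peptide, c_deleted=3))               # carboxyl-terminal truncation
    print(truncation_mutant(peptide, n_deleted=2, c_deleted=3))  # double truncation
```

Because each deletion is a continuous series of residues that includes a terminus, the result is always a single continuous region of the original sequence.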

Also preferred in this aspect of the invention are fragments
characterized by structural or functional attributes of
Topoisomerase III. Preferred embodiments of the invention
in this regard include fragments that comprise alpha-helix
and alpha-helix forming regions ("alpha-regions"), beta-sheet
and beta-sheet-forming regions ("beta-regions"), turn
and turn-forming regions ("turn-regions"), coil and coil-forming
regions ("coil-regions"), hydrophilic regions,
hydrophobic regions, alpha amphipathic regions, beta
amphipathic regions, flexible regions, surface-forming
regions and high antigenic index regions of Topoisomerase
III.

Further preferred regions are those that mediate activities
of Topoisomerase III. Most highly preferred in this regard
are fragments that have a chemical, biological or other
activity of Topoisomerase III, including those with a similar
activity or an improved activity, or with a decreased undesirable
activity. Routinely one generates the fragment by
well-known methods then compares the activity of the
fragment to native Topoisomerase III in a convenient assay
such as listed hereinbelow. Highly preferred in this regard
are fragments that contain regions that are homologs in
sequence, or in position, or in both sequence and position, to active
regions of related polypeptides, such as the related polypeptides
set out in FIG. 2 (SEQ ID NO: 2), which include E. coli
topoisomerase III and H. influenzae topoisomerase III.
Among particularly preferred fragments in these regards are
truncation mutants, as discussed above. Further preferred
polynucleotide fragments are those that are antigenic or
immunogenic in an animal, especially in a human.

It will be appreciated that the invention also relates to, 
among others, polynucleotides encoding the aforementioned 
fragments, polynucleotides that hybridize to polynucleotides 
encoding the fragments, particularly those that hybridize 
under stringent conditions, and polynucleotides, such as 
PCR primers, for amplifying polynucleotides that encode 
the fragments. In these regards, preferred polynucleotides 
are those that correspond to the preferred fragments, as 
discussed above. 
Vectors, host cells, expression 
The present invention also relates to vectors which 
comprise a polynucleotide or polynucleotides of the present 
invention, host cells which are genetically engineered with 
vectors of the invention and the production of polypeptides 
of the invention by recombinant techniques. 

Host cells can be genetically engineered to incorporate 
polynucleotides and express polypeptides of the present 
invention. Introduction of a polynucleotide into the host 
cell can be effected by calcium phosphate transfection, 
DEAE-dextran mediated transfection, transvection, 
microinjection, cationic lipid-mediated transfection, 
electroporation, transduction, scrape loading, ballistic 
introduction, infection or other methods. Such methods are 
described in many standard laboratory manuals, such as 
Davis et al., BASIC METHODS IN MOLECULAR 
BIOLOGY, (1986) and Sambrook et al., MOLECULAR 
CLONING: A LABORATORY MANUAL, 2nd Ed., Cold 
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 
(1989). 

Polynucleotide constructs in host cells can be used in a 
conventional manner to produce the gene product encoded 
by the recombinant sequence. Alternatively, the 
polypeptides of the invention can be synthetically produced by 
conventional peptide synthesizers. 
Mature proteins can be expressed in mammalian cells, 
yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived 
from the DNA constructs of the present invention. 
Appropriate cloning and expression vectors for use with 
prokaryotic and eukaryotic hosts are described by Sambrook et al., 
MOLECULAR CLONING: A LABORATORY MANUAL, 2nd 
Ed., Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, N.Y. (1989). 

In accordance with this aspect of the invention the vector 
may be, for example, a plasmid vector, a single or 
double-stranded phage vector, a single or double-stranded RNA or 
DNA viral vector. Plasmids generally are designated herein 
by a lower case p preceded and/or followed by capital letters 
and/or numbers, in accordance with standard naming 
conventions that are familiar to those of skill in the art. Starting 
plasmids disclosed herein are either commercially available, 
publicly available, or can be constructed from available 
plasmids by routine application of well known, published 
procedures. Many plasmids and other cloning and 
expression vectors that can be used in accordance with the present 
invention are well known and readily available to those of 
skill in the art. 

Preferred among vectors, in certain respects, are those for 
expression of polynucleotides and polypeptides of the 
present invention. Generally, such vectors comprise 
cis-acting control regions effective for expression in a host 
operatively linked to the polynucleotide to be expressed. 
Appropriate trans-acting factors either are supplied by the 
host, supplied by a complementing vector or supplied by the 
vector itself upon introduction into the host. 

In certain preferred embodiments in this regard, the vectors 
provide for specific expression. Such specific expression 
may be inducible expression or expression only in certain 
types of cells or both inducible and cell-specific. Particularly 
preferred among inducible vectors are vectors that can be 
induced for expression by environmental factors that are 
easy to manipulate, such as temperature and nutrient 
additives. A variety of vectors suitable to this aspect of the 
invention, including constitutive and inducible expression 
vectors for use in prokaryotic and eukaryotic hosts, are well 
known and employed routinely by those of skill in the art. 

A great variety of expression vectors can be used to 
express a polypeptide of the invention. Such vectors include, 
among others, chromosomal, episomal and virus-derived 
vectors, e.g., vectors derived from bacterial plasmids, from 
bacteriophage, from transposons, from yeast episomes, from 
insertion elements, from yeast chromosomal elements, from 
viruses such as baculoviruses, papova viruses, such as 
SV40, vaccinia viruses, adenoviruses, fowl pox viruses, 
pseudorabies viruses and retroviruses, and vectors derived 
from combinations thereof, such as those derived from 
plasmid and bacteriophage genetic elements, such as 
cosmids and phagemids, all may be used for expression in 
accordance with this aspect of the present invention. 
Generally, any vector suitable to maintain, propagate or 
express polynucleotides to express a polypeptide in a host 
may be used for expression in this regard. 

The appropriate DNA sequence may be inserted into the 
vector by any of a variety of well-known and routine 
techniques, such as, for example, those set forth in 
Sambrook et al., MOLECULAR CLONING, A LABORATORY 
MANUAL, 2nd Ed.; Cold Spring Harbor Laboratory Press, 
Cold Spring Harbor, N.Y. (1989). 

The DNA sequence in the expression vector is operatively 
linked to appropriate expression control sequence(s), 
including, for instance, a promoter to direct mRNA 
transcription. Representatives of such promoters include, but are 
not limited to, the phage lambda PL promoter, the E. coli lac, 
trp and tac promoters, the SV40 early and late promoters and 
promoters of retroviral LTRs. 

In general, expression constructs will contain sites for 
transcription initiation and termination, and, in the 
transcribed region, a ribosome binding site for translation. The 
coding portion of the mature transcripts expressed by the 
constructs will include a translation initiating AUG at the 
beginning and a termination codon appropriately positioned 
at the end of the polypeptide to be translated. 

In addition, the constructs may contain control regions 
that regulate as well as engender expression. Generally, in 
accordance with many commonly practiced procedures, 
such regions will operate by controlling transcription, such 
as transcription factors, repressor binding sites and 
termination, among others. 

Vectors for propagation and expression generally will 
include selectable markers and amplification regions, such 
as, for example, those set forth in Sambrook et al., 
MOLECULAR CLONING, A LABORATORY MANUAL, 2nd 
Ed.; Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, N.Y. (1989). 

Representative examples of appropriate hosts include 
bacterial cells, such as streptococci, staphylococci, E. coli, 
streptomyces and Bacillus subtilis cells; fungal cells, such as 
yeast cells and Aspergillus cells; insect cells such as 
Drosophila S2 and Spodoptera Sf9 cells; animal cells such as 
CHO, COS, HeLa, C127, 3T3, BHK, 293 and Bowes 
melanoma cells; and plant cells. 

The following vectors, which are commercially available, 
are provided by way of example. Among vectors preferred 
for use in bacteria are pQE70, pQE60 and pQE9, available 
from Qiagen; pBS vectors, Phagescript vectors, Bluescript 
vectors, pNH8A, pNH16a, pNH18A, pNH46A, available 
from Stratagene; and ptrc99a, pKK223-3, pKK233-3, 
pDR540, pRIT5 available from Pharmacia, and pBR322 
(ATCC 37017). Among preferred eukaryotic vectors are 
pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available 
from Stratagene; and pSVK3, pBPV, pMSG and pSVL 
available from Pharmacia. These vectors are listed solely by 
way of illustration of the many commercially available and 
well known vectors that are available to those of skill in the 
art for use in accordance with this aspect of the present 
invention. It will be appreciated that any other plasmid or 
vector suitable for, for example, introduction, maintenance, 
propagation or expression of a polynucleotide or 
polypeptide of the invention in a host may be used in this aspect of 
the invention. 

Promoter regions can be selected from any desired gene 
using vectors that contain a reporter transcription unit 
lacking a promoter region, such as a chloramphenicol acetyl 
transferase ("CAT") transcription unit, downstream of 
restriction site or sites for introducing a candidate promoter 
fragment; i.e., a fragment that may contain a promoter. As is 
well known, introduction into the vector of a 
promoter-containing fragment at the restriction site upstream of the cat 
gene engenders production of CAT activity, which can be 
detected by standard CAT assays. Vectors suitable to this end 
are well known and readily available, such as pKK232-8 and 
pCM7. Promoters for expression of polynucleotides of the 
present invention include not only well known and readily 
available promoters, but also promoters that readily may be 
obtained by the foregoing technique, using a reporter gene. 

Among known prokaryotic promoters suitable for 
expression of polynucleotides and polypeptides in accordance with 
the present invention are the E. coli lacI and lacZ 
promoters, the T3 and T7 promoters, the gpt promoter, the 
lambda PR, PL promoters and the trp promoter. 

Among known eukaryotic promoters suitable in this 
regard are the CMV immediate early promoter, the HSV 
thymidine kinase promoter, the early and late SV40 
promoters, the promoters of retroviral LTRs, such as those 
of the Rous sarcoma virus ("RSV"), and metallothionein 
promoters, such as the mouse metallothionein-I promoter. 

Recombinant expression vectors will include, for 
example, origins of replication, a promoter preferably 
derived from a highly-expressed gene to direct transcription 
of a downstream structural sequence, and a selectable 
marker to permit isolation of vector-containing cells after 
exposure to the vector. 

Polynucleotides of the invention, encoding the 
heterologous structural sequence of a polypeptide of the invention, 
generally will be inserted into the vector using standard 
techniques so that they are operably linked to the promoter for 
expression. The polynucleotide will be positioned so that the 
transcription start site is located appropriately 5' to a 
ribosome binding site. The ribosome binding site will be 5' to the 
AUG that initiates translation of the polypeptide to be 
expressed. Generally, there will be no other open reading 
frames that begin with an initiation codon, usually AUG, and 
lie between the ribosome binding site and the initiation 
codon. Also, generally, there will be a translation stop codon 
at the end of the polypeptide and there will be a 
polyadenylation signal in constructs for use in eukaryotic hosts. 
A transcription termination signal appropriately disposed at 
the 3' end of the transcribed region may also be included in 
the polynucleotide construct. 
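
The positional rules in the preceding paragraph (ribosome binding site 5' to the initiating AUG, no additional AUG between them, and a stop codon at the end of the coding region) can be expressed as a simple check. The following Python sketch is illustrative only: the function name `check_coding_layout`, the index conventions and the short example transcript are assumptions, not part of the specification.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def check_coding_layout(mrna, rbs_end, start):
    """List layout problems in a transcript written 5' to 3' as RNA.

    mrna    -- transcript sequence (A, C, G, U)
    rbs_end -- index of the first base after the ribosome binding site
    start   -- index of the A of the intended initiating AUG
    """
    problems = []
    if start < rbs_end:
        problems.append("ribosome binding site is not 5' to the initiation codon")
    if mrna[start:start + 3] != "AUG":
        problems.append("coding region does not begin with AUG")
    # An AUG that begins anywhere between the RBS and the intended start
    # could initiate another open reading frame.
    if "AUG" in mrna[rbs_end:start + 2]:
        problems.append("extra AUG between ribosome binding site and initiation codon")
    # Walk the reading frame from the start codon and look for a stop codon.
    codons = [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]
    if not any(codon in STOP_CODONS for codon in codons):
        problems.append("no in-frame stop codon")
    return problems


# Hypothetical transcript: leader, RBS, spacer, then a four-codon reading frame.
example = "GG" + "AGGAGG" + "UAACC" + "AUGGCUAAAUAA" + "GC"
issues = check_coding_layout(example, rbs_end=8, start=13)
```

An empty list means none of the three problems was found; the sketch does not model promoters, polyadenylation signals or transcription terminators.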

For secretion of the translated protein into the lumen of 
the endoplasmic reticulum, into the periplasmic space or 
into the extracellular environment, appropriate secretion 
signals may be incorporated into the expressed polypeptide. 
These signals may be endogenous to the polypeptide or they 
may be heterologous signals. 

The polypeptide may be expressed in a modified form, 
such as a fusion protein, and may include not only secretion 
signals but also additional heterologous functional regions. 
Thus, for instance, a region of additional amino acids, 
particularly charged amino acids, may be added to the N- or 
C-terminus of the polypeptide to improve stability and 
persistence in the host cell, during purification or during 
subsequent handling and storage. Also, a region may be 
added to the polypeptide to facilitate purification. Such 
regions may be removed prior to final preparation of the 
polypeptide. The addition of peptide moieties to 
polypeptides to engender secretion or excretion, to improve stability 
or to facilitate purification, among others, are familiar and 
routine techniques in the art. A preferred fusion protein 
comprises a heterologous region from immunoglobulin that 
is useful to solubilize or purify polypeptides. For example, 
EP-A-0 464 533 (Canadian counterpart 2045869) discloses 
fusion proteins comprising various portions of constant 
region of immunoglobulin molecules together with another 
protein or part thereof. In drug discovery, for example, 
proteins have been fused with antibody Fc portions for the 
purpose of high-throughput screening assays to identify 
antagonists. See, D. Bennett et al., Journal of Molecular 
Recognition, 8: 52-58 (1995) and K. Johanson et al., The 
Journal of Biological Chemistry, 270, (16): 9459-9471 
(1995). 

Cells typically then are harvested by centrifugation, 
disrupted by physical or chemical means, and the resulting 
crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be 
disrupted by any convenient method, including freeze-thaw 
cycling, sonication, mechanical disruption, or use of cell 
lysing agents; such methods are well known to those skilled 
in the art. 

Mammalian expression vectors may comprise an origin of 
replication, a suitable promoter and enhancer, and also any 
necessary ribosome binding sites, polyadenylation sites, 
splice donor and acceptor sites, transcriptional termination 
sequences, and 5' flanking non-transcribed sequences that 
are necessary for expression. In certain preferred 
embodiments in this regard DNA sequences derived from the SV40 
splice sites, and the SV40 polyadenylation sites are used for 
required non-transcribed genetic elements of these types. 

Topoisomerase III polypeptide can be recovered and 
purified from recombinant cell cultures by well-known 
methods including ammonium sulfate or ethanol 
precipitation, acid extraction, anion or cation exchange 
chromatography, phosphocellulose chromatography, 
hydrophobic interaction chromatography, affinity 
chromatography, hydroxylapatite chromatography and 
lectin chromatography. Most preferably, high performance 
liquid chromatography ("HPLC") is employed for purification. 
Well known techniques for refolding protein may be 
employed to regenerate active conformation when the 
polypeptide is denatured during isolation and/or purification. 

Polypeptides of the present invention include naturally 
purified products, products of chemical synthetic 
procedures, and products produced by recombinant 
techniques from a prokaryotic or eukaryotic host, including, for 
example, bacterial, yeast, higher plant, insect and 
mammalian cells. Depending upon the host employed in a 
recombinant production procedure, the polypeptides of the present 
invention may be glycosylated or may be non-glycosylated. 
In addition, polypeptides of the invention may also include 
an initial modified methionine residue, in some cases as a 
result of host-mediated processes. 

Topoisomerase III polynucleotides and polypeptides may 
be used in accordance with the present invention for a 
variety of applications, particularly those that make use of 
the chemical and biological properties of Topoisomerase III. 
Additional applications relate to diagnosis and to treatment 
of disorders of cells, tissues and organisms. These aspects of 
the invention are illustrated further by the following discus- 
sion. 

Polynucleotide assays 

This invention is also related to the use of the 
Topoisomerase III polynucleotides to detect complementary 
polynucleotides such as, for example, as a diagnostic reagent. 
Detection of a bacterial Topoisomerase III in a eukaryote, 
particularly a mammal, and especially a human, will provide 
a diagnostic method that can add to, define or allow a 
diagnosis of a disease. Eukaryotes (herein also 
"individual(s)"), particularly mammals, and especially humans, 
infected by a Topoisomerase III producing bacterium may be 
detected at the DNA or RNA level by a variety of techniques. 
Nucleic acids for diagnosis may be obtained from an 
individual's cells and tissues, such as bone, blood, muscle, 
cartilage, and skin. Tissue biopsy and autopsy material are 
also preferred for samples from an individual to use in a 
diagnostic assay. The bacterial DNA may be used directly 
for detection or may be amplified enzymatically by using 
PCR (Saiki et al., Nature 324: 163-166 (1986)) prior to 
analysis. RNA or cDNA may also be used in the 
same ways. As an example, PCR primers complementary to 
the nucleic acid encoding Topoisomerase III can be used to 
identify and analyze Topoisomerase III presence and 
expression. Using PCR, characterization of the strain of prokaryote 
present in a eukaryote, particularly a mammal, and 
especially a human, may be made by an analysis of the genotype 
of the prokaryote gene. For example, deletions and 
insertions can be detected by a change in size of the amplified 
product in comparison to the genotype of a reference 
sequence. Point mutations can be identified by hybridizing 
amplified DNA to radiolabeled Topoisomerase III RNA or 
alternatively, radiolabeled Topoisomerase III antisense DNA 
sequences. Perfectly matched sequences can be 
distinguished from mismatched duplexes by RNase A digestion or 
by differences in melting temperatures. 
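
The size comparison described above for detecting deletions and insertions can be sketched in a few lines of Python. This is an illustrative assumption-laden example: the function name `classify_amplicon`, the optional sizing tolerance and the fragment lengths in the usage lines are hypothetical and do not come from the specification.

```python
def classify_amplicon(sample_bp, reference_bp, tolerance_bp=0):
    """Compare an amplified product's length with the reference length.

    tolerance_bp allows for the sizing error of the gel or instrument;
    differences within the tolerance are reported as no size change.
    """
    difference = sample_bp - reference_bp
    if difference > tolerance_bp:
        return "insertion"
    if difference < -tolerance_bp:
        return "deletion"
    return "no size change"


# Hypothetical fragment lengths in base pairs.
longer = classify_amplicon(512, 500)
shorter = classify_amplicon(480, 500)
```

A result of "no size change" does not exclude point mutations, which leave the product length unaltered and are addressed by the hybridization, RNase A and melting-temperature approaches mentioned in the text.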

Sequence differences between a reference gene and genes 
having mutations also may be revealed by direct DNA 
sequencing. In addition, cloned DNA segments may be 
employed as probes to detect specific DNA segments. The 
sensitivity of such methods can be greatly enhanced by 
appropriate use of PCR or another amplification method. For 
example, a sequencing primer is used with double-stranded 
PCR product or a single-stranded template molecule 
generated by a modified PCR. The sequence determination is 
performed by conventional procedures with radiolabeled 
nucleotide or by automatic sequencing procedures with 
fluorescent tags. 

Genetic typing of various strains of bacteria based on 
DNA sequence differences may be achieved by detection of 
alteration in electrophoretic mobility of DNA fragments in 
gels, with or without denaturing agents. Small sequence 
deletions and insertions can be visualized by high resolution 
gel electrophoresis. DNA fragments of different sequences 
may be distinguished on denaturing formamide gradient gels 
in which the mobilities of different DNA fragments are 
retarded in the gel at different positions according to their 
specific melting or partial melting temperatures (see, e.g., 
Myers et al., Science, 230: 1242 (1985)). 

Sequence changes at specific locations also may be 
revealed by nuclease protection assays, such as RNase and 
S1 protection or the chemical cleavage method (e.g., Cotton 
et al., Proc. Nat'l Acad. Sci., USA, 85: 4397-4401 (1985)). 

Thus, the detection of a specific DNA sequence may be 
achieved by methods such as hybridization, RNase 
protection, chemical cleavage, direct DNA sequencing or the 
use of restriction enzymes (e.g., restriction fragment length 
polymorphisms ("RFLP")) and Southern blotting of genomic 
DNA. 

In addition to more conventional gel-electrophoresis and 
DNA sequencing, mutations also can be detected by in situ 
analysis. 

Cells carrying mutations or polymorphisms in the gene of 
the present invention may also be detected at the DNA level 
by a variety of techniques, to allow for serotyping, for 
example. Nucleic acids for diagnosis may be obtained from 
an infected individual's cells, including but not limited to 
blood, urine, saliva, tissue biopsy and autopsy material or 
from bacteria isolated and cultured from the above 
sources. The bacterial DNA may be used directly for 
detection or may be amplified enzymatically by using PCR (Saiki 
et al., Nature, 324:163-166 (1986)) prior to analysis. 
RT-PCR can also be used to detect mutations. It is 
particularly preferred to use RT-PCR in conjunction with 
automated detection systems, such as, for example, GeneScan. 
RNA or cDNA may also be used for the same purpose, PCR 
or RT-PCR. As an example, PCR primers complementary to 
the nucleic acid encoding Topoisomerase III can be made 
using known methods and used to identify and analyze 
mutations. They may also be used to obtain full length 
gene sequence using known methods, such as, for example, 
PCR. Examples of such primers include, but are not limited 
to, 5'-TAAAAGAACGTATGAGAAAG-3' [SEQ ID NO:4] 
(upper primer) and 
5'-AAAAACAATACCAAAAGCGAACT-3' [SEQ ID 
NO:5] (lower primer). For example, deletions and insertions 
can be detected by a change in size of the amplified product 
in comparison to the normal genotype. Point mutations can 
be identified by hybridizing amplified DNA to radiolabeled 
RNA or alternatively, radiolabeled antisense DNA 
sequences. Perfectly matched sequences can be 
distinguished from mismatched duplexes by RNase A digestion or 
by differences in melting temperatures. These primers may 
be used for amplifying Topoisomerase III cDNA isolated 
from a sample derived from an individual. The invention 
also provides such primers with 1, 2, 3 or 4 nucleotides 
removed from the 5' and/or the 3' end. The primers may be 
used to amplify the gene isolated from the individual such 
that the gene may then be subject to various techniques for 
elucidation of the DNA sequence. In this way, mutations in 
the DNA sequence may be detected. 
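
As a rough illustration of how the two primers recited above might be characterized, the following Python sketch computes length, GC content, a reverse complement and an approximate melting temperature, and enumerates the variants having 1, 2, 3 or 4 nucleotides removed from the 5' and/or the 3' end. The helper names are assumptions introduced here, and the melting temperature uses the simple Wallace rule (2 degrees C per A or T, 4 degrees C per G or C), which is a swapped-in rule of thumb for short oligonucleotides and not a method stated in the specification.

```python
UPPER = "TAAAAGAACGTATGAGAAAG"       # SEQ ID NO:4, as recited in the text
LOWER = "AAAAACAATACCAAAAGCGAACT"    # SEQ ID NO:5, as recited in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Approximate melting temperature in degrees C by the Wallace rule."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


def trimmed_variants(seq, max_trim=4):
    """Primers with 0..max_trim bases removed from each end, excluding
    the untrimmed primer. Returns (removed_5prime, removed_3prime, seq)."""
    variants = []
    for five in range(max_trim + 1):
        for three in range(max_trim + 1):
            if five == 0 and three == 0:
                continue
            variants.append((five, three, seq[five:len(seq) - three]))
    return variants


for name, primer in (("upper", UPPER), ("lower", LOWER)):
    print(name, len(primer), gc_count(primer), wallace_tm(primer),
          len(trimmed_variants(primer)))
```

The Wallace estimate ignores salt and primer concentration, so it is only a first approximation when comparing matched and mismatched duplexes by melting temperature.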
Polypeptide assays 

The present invention also relates to diagnostic assays 
such as quantitative and diagnostic assays for detecting 
levels of Topoisomerase III protein in cells and tissues, 
including determination of normal and abnormal levels. 
Thus, for instance, a diagnostic assay in accordance with the 
invention for detecting expression of Topoisomerase III 
protein compared to normal control tissue samples may be 
used to detect the presence of an infection. Assay techniques 
that can be used to determine levels of a protein, such as a 
Topoisomerase III protein of the present invention, in a 
sample derived from a host are well-known to those of skill 
in the art. Such assay methods include radioimmunoassays, 
competitive-binding assays, Western Blot analysis and 
ELISA assays. Among these ELISAs frequently are 
preferred. An ELISA assay initially comprises preparing an 
antibody specific to Topoisomerase III, preferably a 
monoclonal antibody. In addition a reporter antibody generally is 
prepared which binds to the monoclonal antibody. The 
reporter antibody is attached to a detectable reagent such as 
a radioactive, fluorescent or enzymatic reagent, in this 
example horseradish peroxidase enzyme. 
Antibodies 

The polypeptides, their fragments or other derivatives, or 
analogs thereof, or cells expressing them can be used as an 
immunogen to produce antibodies thereto. The present 
invention includes, for example, monoclonal and polyclonal 
antibodies, chimeric, single chain, and humanized 
antibodies, as well as Fab fragments, or the product of an 
Fab expression library. 

Antibodies generated against the polypeptides 
corresponding to a sequence of the present invention can be 
obtained by direct injection of the polypeptides into an 
animal or by administering the polypeptides to an animal, 
preferably a nonhuman. The antibody so obtained will then 
bind the polypeptides themselves. In this manner, even a sequence 
encoding only a fragment of the polypeptides can be used to 
generate antibodies binding the whole native polypeptides. 
Such antibodies can then be used to isolate the polypeptide 
from tissue expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique 
known in the art which provides antibodies produced by 
continuous cell line cultures can be used. Examples include 
various techniques, such as those in Kohler, G. and Milstein, 
C., Nature 256: 495-497 (1975); Kozbor et al., Immunology 
Today 4: 72 (1983); Cole et al., pg. 77-96 in 
MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. 
Liss, Inc. (1985). 
Techniques described for the production of single chain 
antibodies (U.S. Pat. No. 4,946,778) can be adapted to 
produce single chain antibodies to immunogenic 
polypeptide products of this invention. Also, transgenic mice, or 
other organisms such as other mammals, may be used to 
express humanized antibodies to immunogenic polypeptide 
products of this invention. 

Alternatively phage display technology could be utilized 
to select antibody genes with binding activities towards the 
polypeptide either from repertoires of PCR amplified 
v-genes of lymphocytes from humans screened for 
possessing anti-Fbp or from naive libraries (McCafferty, J. et al., 
Nature 348, 552-554 (1990); Marks, J. et al., Biotechnology 
10: 779-783 (1992)). The affinity of these antibodies can also 
be improved by chain shuffling (Clackson, T. et al., Nature 
352, 624-628 (1991)). 

If two antigen binding domains are present each domain 
may be directed against a different epitope, termed 
'bispecific' antibodies. 

The above-described antibodies may be employed to 
isolate or to identify clones expressing the polypeptide or 
purify the polypeptide of the present invention by 
attachment of the antibody to a solid support for isolation and/or 
purification by affinity chromatography. 

Thus among others, antibodies against Topoisomerase III 
may be employed to inhibit and/or treat infections, 
particularly bacterial infections, and especially Staphylococcal 
infections as well as to monitor the effectiveness of 
antibiotic treatment. 

Polypeptide derivatives include antigenically, epitopically 
or immunologically equivalent derivatives which form a 
particular aspect of this invention. The term "antigenically 
equivalent derivative" as used herein encompasses a 
polypeptide or its equivalent which will be specifically 
recognized by certain antibodies which, when raised to the 
protein or polypeptide according to the present invention, 
interfere with the immediate physical interaction between 
pathogen and mammalian host. The term "immunologically 
equivalent derivative" as used herein encompasses a peptide 
or its equivalent which when used in a suitable formulation 
to raise antibodies in a vertebrate, the antibodies act to 
interfere with the immediate physical interaction between 
pathogen and mammalian host. 

The polypeptide, such as an antigenically or 
immunologically equivalent derivative or a fusion protein thereof is used 
as an antigen to immunize a mouse or other animal such as 
a rat or chicken. The fusion protein may provide stability to 
the polypeptide. The antigen may be associated, for example 
by conjugation, with an immunogenic carrier protein for 
example bovine serum albumin (BSA) or keyhole limpet 
haemocyanin (KLH). Alternatively a multiple antigenic 
peptide comprising multiple copies of the protein or 
polypeptide, or an antigenically or immunologically 
equivalent polypeptide thereof may be sufficiently antigenic to 
improve immunogenicity so as to obviate the use of a carrier. 

Preferably the antibody or derivative thereof is modified 
to make it less immunogenic in the individual. For example, 
if the individual is human the antibody may most preferably 
be "humanised"; where the complementarity determining 
region(s) of the hybridoma-derived antibody has been 
transplanted into a human monoclonal antibody, for example as 
described in Jones, P. et al., Nature 321: 522-525 (1986) or 
Tempest et al., Biotechnology 9: 266-273 (1991). 

The use of a polynucleotide of the invention in genetic 
immunization will preferably employ a suitable delivery 
method such as direct injection of plasmid DNA into 
muscles (Wolff et al., Hum Mol Genet 1:363 (1992), 
Manthorpe et al., Hum. Gene Ther. 1963:4: 419 (1963)), delivery 
of DNA complexed with specific protein carriers (Wu et al., 
J. Biol. Chem. 264:16985 (1989)), coprecipitation of DNA 
with calcium phosphate (Benvenisty and Reshef, Proc. 
Nat'l Acad. Sci. (USA) 83:9551 (1986)), encapsulation of 
DNA in various forms of liposomes (Kaneda et al., Science 
243:375 (1989)), particle bombardment (Tang et al., Nature, 
(1992) 356:152, Eisenbraun et al., DNA Cell Biol 12:791 
(1993)) and [illegible] 
(Seeger et al., Proc. Nat'l Acad. Sci. (USA) 81:5849 (1984)). 
Topoisomerase III binding molecules and assays 

This invention also provides a method for identification of 
molecules, such as binding molecules, that bind 
Topoisomerase III. Genes encoding proteins that bind 
Topoisomerase III, such as binding proteins, can be identified by 
numerous methods known to those of skill in the art, for 
example, ligand panning and FACS sorting. Such methods 
are described in many laboratory manuals such as, for 
instance, Coligan et al., Current Protocols in Immunology 
1(2): Chapter 5 (1991). 

For instance, expression cloning may be employed for 
this purpose. To this end polyadenylated RNA is prepared 
from a cell expressing Topoisomerase III, a cDNA library is 
created from this RNA, the library is divided into pools and 
the pools are transfected individually into cells that are not 
expressing Topoisomerase III. The transfected cells then 
are exposed to labeled Topoisomerase III. Topoisomerase III 
can be labeled by a variety of well-known techniques 
including standard methods of radio-iodination or inclusion 
of a recognition site for a site-specific protein kinase. 
Following exposure, the cells are fixed and binding of 
Topoisomerase III is determined. These procedures 
conveniently are carried out on glass slides. 

Pools are identified of cDNA that produced 
Topoisomerase III-binding cells. Sub-pools are prepared from 
these positives, transfected into host cells and screened as 
described above. Using an iterative sub-pooling and 
re-screening process, one or more single clones that encode 
the putative binding molecule, such as a binding molecule, 
can be isolated. 

Alternatively a labeled ligand can be photoaffinity linked 
to a cell extract, such as a membrane or a membrane extract, 
prepared from cells that express a molecule that it binds, 
such as a binding molecule. Cross-linked material is 
resolved by polyacrylamide gel electrophoresis ("PAGE") 
and exposed to X-ray film. The labeled complex containing 
the ligand-binding can be excised, resolved into peptide 
fragments, and subjected to protein microsequencing. The 
amino acid sequence obtained from microsequencing can be 
used to design unique or degenerate oligonucleotide probes 
to screen cDNA libraries to identify genes encoding the 
putative binding molecule. 

Polypeptides of the invention also can be used to assess 
Topoisomerase III binding capacity of Topoisomerase III 
binding molecules, such as binding molecules, in cells or in 
cell-free preparations. 

Polypeptides of the invention may also be used to assess 
the binding of small molecule substrates and ligands in, for 
example, cells, cell-free preparations, chemical libraries, 
and natural product mixtures. These substrates and ligands 
may be natural substrates and ligands or may be structural or 
functional mimetics. 
Antagonists and Agonists - assays and molecules 

As mentioned above, both increases and decreases in 
DNA density have been associated with bacterial responses 
to environmental challenges. Accordingly, modulating, i.e., 
agonizing or antagonizing, the appropriate response could 
result in a potential antibiotic effect. 
The invention also provides a method of screening compounds
to identify those which enhance or block the action
of Topoisomerase III on cells, such as its interaction with
substrate molecules, such as supercoiled DNA. Compounds
which block the action of Topoisomerase III on cells include
those which act as poisons and stabilize Topoisomerase III
in a covalent complex with DNA, resulting in an inhibitory
effect on cell growth. An antagonist is a compound which
decreases the natural biological functions of Topoisomerase
III. An agonist is a compound which increases the natural
biological functions of Topoisomerase III.

Barrett et al., Antimicrob. Agents Chemother. 34:1 (1990)
review in-vitro assays which can be used to measure inhibition
of topoisomerases. These assays can be categorized as
catalytic assays and noncatalytic assays. Catalytic assays for
bacterial topoisomerase III include, for example, assays to
measure the relaxation of supercoiled DNA. Noncatalytic
assays, also known as 'cleavable complex' assays, measure
the formation of a key covalent reaction intermediate.
Froelich-Ammon and Osheroff, J. Biol. Chem. 270:21429
(1995) review the mechanistic basis of noncatalytic assays
of topoisomerase poisons.

Supercoiled DNA relaxation assay

To screen for inhibitors of the relaxation reaction, a
candidate inhibitor and a preparation of Topoisomerase III
are incubated with a supercoiled DNA substrate, for example
plasmid or phage DNA, in an appropriate buffer containing
Mg2+, or an alternative divalent metal ion. Reaction products
are separated by agarose gel electrophoresis, visualized
by ethidium bromide staining, and quantified by densitometry.

DNA oligomer cleavage assay

A single stranded DNA oligomer containing appropriate
cleavage sites, for example the 22mer
GAATGAGCCGCAACTTCGGGAT (SEQ ID NO: 3), or an appropriately
labelled derivative, may be used as substrate. An appropriate
label may be a radiolabel or a fluorescent chromophore
attached at the 5' or 3' end of the oligo, according to the
specific assay used. The substrate is incubated with a candidate
inhibitor and a preparation of Topoisomerase III, in an
appropriate buffer. The buffer may contain Mg2+ or an
alternative divalent metal ion. Mg2+ is not essential for the
cleavage reaction, although its inclusion may be desirable to
facilitate the interaction of certain classes of inhibitors. The
reaction is stopped by the addition of an appropriate
denaturant, for example 1% SDS or 100 mM NaOH. Generation
of the cleavable complex (stabilization of the key
covalent reaction intermediate) may be measured by a
number of methods. For example, electrophoresis using a
denaturing polyacrylamide gel can be used to separate the 5'
labelled cleaved DNA product which may then be quantified
by densitometry. Alternatively, the 3' labelled DNA product
may be assayed by virtue of its covalent association with
Topoisomerase III. This may be performed by the SDS/K
precipitation assay, in which radiolabelled DNA associated
with precipitated protein is measured, or by a capture assay
format in which Topoisomerase III is immobilized using an
antibody and the amount of associated labelled DNA is
measured.

Whole cell assays

Topoisomerase III-like effects of potential agonists and
antagonists and poisons may be measured, for instance, by
determining activity of a reporter system that is sensitive to
alterations in gene expression following interaction of the
candidate molecule with a cell or appropriate cell preparation.
Reporter systems that may be useful in this regard
include but are not limited to colorimetric labeled substrate
converted into product, a reporter gene that is responsive to
changes in Topoisomerase III activity, and binding assays
known in the art.

Potential antagonists include small organic molecules,
peptides, polypeptides and antibodies that bind to a polypeptide
of the invention and thereby inhibit or extinguish its
activity, or stabilize the key covalent reaction intermediate
with DNA. Potential antagonists also may be small organic
molecules, a peptide, a polypeptide such as a closely related
protein or antibody that binds the same sites on a binding
molecule, such as a binding molecule, without inducing
Topoisomerase III-induced activities, thereby preventing the
action of Topoisomerase III by excluding Topoisomerase III
from binding.

Potential antagonists include a small molecule which
binds to and occupies the binding site of the polypeptide
thereby preventing binding to cellular binding molecules,
such as binding molecules, such that normal biological
activity is prevented. Examples of small molecules include
but are not limited to small organic molecules, peptides or
peptide-like molecules.

Other potential antagonists include antisense molecules.
Antisense technology can be used to control gene expression
through antisense DNA or RNA or through double- or
triple-helix formation. Antisense techniques are discussed,
for example, in Okano, J. Neurochem. 56: 560 (1991);
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS
OF GENE EXPRESSION, CRC Press, Boca Raton,
Fla. (1988). Triple helix formation is discussed in, for
instance Lee et al., Nucleic Acids Research 6: 3073 (1979);
Cooney et al., Science 241: 456 (1988); and Dervan et al.,
Science 251: 1360 (1991). The methods are based on binding
of a polynucleotide to a complementary DNA or RNA.
For example, the 5' coding portion of a polynucleotide that
encodes the mature polypeptide of the present invention may
be used to design an antisense RNA oligonucleotide of from
about 10 to 40 base pairs in length. A DNA oligonucleotide
is designed to be complementary to a region of the gene
involved in transcription thereby preventing transcription
and the production of Topoisomerase III. The antisense RNA
oligonucleotide hybridizes to the mRNA in vivo and blocks
translation of the mRNA molecule into Topoisomerase III
polypeptide. The oligonucleotides described above can also
be delivered to cells such that the antisense RNA or DNA
may be expressed in vivo to inhibit production of
Topoisomerase III.

Preferred potential antagonists include compounds related
to and derivatives of each of the DNA sequences provided
herein may be used in the discovery and development of
antibacterial compounds. The encoded protein upon expression
can be used as a target for the screening of antibacterial
drugs. Additionally, the DNA sequences encoding the amino
terminal regions of the encoded protein or Shine-Dalgarno
or other translation facilitating sequences of the respective
mRNA can be used to construct antisense sequences to
control the expression of the coding sequence of interest.

The antagonists and agonists of the invention may be
employed in a composition with a pharmaceutically acceptable
carrier, e.g., as hereinafter described.

The antagonists and agonists may be employed for
instance to inhibit staphylococcal infections.

Vaccines

Another aspect of the invention relates to a method for
inducing an immunological response in an individual,
particularly a mammal which comprises inoculating the
individual with Topoisomerase III, or an antigenic fragment or
variant thereof, adequate to produce antibody to protect said
individual from infection, particularly bacterial infection
and most particularly Staphylococcal infection. Yet another
aspect of the invention relates to a method of inducing
immunological response in an individual which comprises,
through gene therapy, delivering gene encoding
Topoisomerase III, or an antigenic fragment or a variant thereof, for
expressing Topoisomerase III, or a fragment or a variant
thereof in vivo in order to induce an immunological
response to produce antibody to protect said individual from
disease.

A further aspect of the invention relates to an immunological
composition which when introduced into a host
capable of having induced within it an immunological
response, induces an immunological response in such host to
a Topoisomerase III or protein coded therefrom, wherein the
composition comprises a recombinant Topoisomerase III or
protein coded therefrom comprising DNA which codes for
and expresses an antigen of said topoisomerase III or
protein coded therefrom.

The Topoisomerase III or a fragment thereof may be fused
with co-protein which may not by itself produce antibodies,
but is capable of stabilizing the first protein and producing
a fused protein which will have immunogenic and protective
properties. Thus fused recombinant protein, preferably further
comprises an antigenic co-protein, such as
Glutathione-S-transferase (GST) or beta-galactosidase, relatively large
co-proteins which solubilise the protein and facilitate production
and purification thereof. Moreover, the co-protein
may act as an adjuvant in the sense of providing a generalized
stimulation of the immune system. The co-protein
may be attached to either the amino or carboxy terminus of
the first protein.

The present invention also includes a vaccine formulation
which comprises the immunogenic recombinant protein
together with a suitable carrier. Since the protein may be
broken down in the stomach, it is preferably administered
parenterally, including, for example, administration that is
subcutaneous, intramuscular, intravenous, or intradermal.
Formulations suitable for parenteral administration include
aqueous and non-aqueous sterile injection solutions which
may contain anti-oxidants, buffers, bacteriostats and solutes
which render the formulation isotonic with the bodily fluid,
preferably the blood, of the individual; and aqueous and
non-aqueous sterile suspensions which may include suspending
agents or thickening agents. The formulations may
be presented in unit-dose or multi-dose containers, for
example, sealed ampoules and vials and may be stored in a
freeze-dried condition requiring only the addition of the
sterile liquid carrier immediately prior to use. The vaccine
formulation may also include adjuvant systems for enhancing
the immunogenicity of the formulation, such as oil-in-water
systems and other systems known in the art. The
dosage will depend on the specific activity of the vaccine
and can be readily determined by routine experimentation.

Whilst the invention has been described with reference to
certain Topoisomerase III, it is to be understood that this
covers fragments of the naturally occurring protein and
similar proteins (for example, having sequence homologies
of 50% or greater) with additions, deletions or substitutions
which do not substantially affect the immunogenic properties
of the recombinant protein.

Compositions

The invention also relates to compositions comprising the
polynucleotide or the polypeptides discussed above or the
agonists or antagonists. Thus, the polypeptides of the present
invention may be employed in combination with a non-sterile
or sterile carrier or carriers for use with cells, tissues
or organisms, such as a pharmaceutical carrier suitable for
administration to a subject. Such compositions comprise, for
instance, a media additive or a therapeutically effective
amount of a polypeptide of the invention and a pharmaceutically
acceptable carrier or excipient. Such carriers may
include, but are not limited to, saline, buffered saline,
dextrose, water, glycerol, ethanol and combinations thereof.
The formulation should suit the mode of administration.

The invention further relates to diagnostic and pharmaceutical
packs and kits comprising one or more containers
filled with one or more of the ingredients of the aforementioned
compositions of the invention. Associated with such
container(s) can be a notice in the form prescribed by a
governmental agency regulating the manufacture, use or sale
of pharmaceuticals or biological products, reflecting
approval by the agency of the manufacture, use or sale of the
product for human administration.

Administration

Polypeptides and other compounds of the present invention
may be employed alone or in conjunction with other
compounds, such as therapeutic compounds.

The pharmaceutical compositions may be administered in
any effective, convenient manner including, for instance,
administration by topical, oral, anal, vaginal, intravenous,
intraperitoneal, intramuscular, subcutaneous, intranasal or
intradermal routes among others.

The pharmaceutical compositions generally are administered
in an amount effective for treatment or prophylaxis of
a specific indication or indications. In general, the compositions
are administered in an amount of at least about 10
μg/kg body weight. In most cases they will be administered
in an amount not in excess of about 8 mg/kg body weight per
day. Preferably, in most cases, dose is from about 10 μg/kg
to about 1 mg/kg body weight, daily. It will be appreciated
that optimum dosage will be determined by standard methods
for each treatment modality and indication, taking into
account the indication, its severity, route of administration,
complicating conditions and the like.

In therapy or as a prophylactic, the active agent may be
administered to an individual as an injectable composition,
for example as a sterile aqueous dispersion, preferably
isotonic.

Alternatively the composition may be formulated for
topical application for example in the form of ointments,
creams, lotions, eye ointments, eye drops, ear drops,
mouthwash, impregnated dressings and sutures and
aerosols, and may contain appropriate conventional
additives, including, for example, preservatives, solvents to
assist drug penetration, and emollients in ointments and
creams. Such topical formulations may also contain compatible
conventional carriers, for example cream or ointment
bases, and ethanol or oleyl alcohol for lotions. Such carriers
may constitute from about 1% to about 98% by weight of the
formulation; more usually they will constitute up to about
80% by weight of the formulation.

For administration to mammals, and particularly humans,
it is expected that the daily dosage level of the active agent
will be from 0.01 mg/kg to 10 mg/kg, typically around 1
mg/kg. The physician in any event will determine the actual
dosage which will be most suitable for an individual and will
vary with the age, weight and response of the particular
individual. The above dosages are exemplary of the average
case. There can, of course, be individual instances where
higher or lower dosage ranges are merited, and such are
within the scope of this invention.

In-dwelling devices include surgical implants, prosthetic
devices and catheters, i.e., devices that are introduced to the
body of an individual and remain in position for an extended
time. Such devices include, for example, artificial joints,
heart valves, pacemakers, vascular grafts, vascular catheters,
cerebrospinal fluid shunts, urinary catheters, continuous
ambulatory peritoneal dialysis (CAPD) catheters, etc.

The composition of the invention may be administered by
injection to achieve a systemic effect against relevant bac- 
teria shortly before insertion of an in-dwelling device. 
Treatment may be continued after surgery during the 
in-body time of the device. In addition, the composition 
could also be used to broaden perioperative cover for any 
surgical technique to prevent Staphylococcal wound infec- 
tions. 

Many orthopedic surgeons consider that humans with 
prosthetic joints should be considered for antibiotic prophy- 
laxis before dental treatment that could produce a bacter- 
aemia. Late deep infection is a serious complication some- 
times leading to loss of the prosthetic joint and is 
accompanied by significant morbidity and mortality. It may 
therefore be possible to extend the use of the active agent as 
a replacement for prophylactic antibiotics in this situation. 

In addition to the therapy described above, the composi- 
tions of this invention may be used generally as a wound 
treatment agent to prevent adhesion of bacteria to matrix 
proteins exposed in wound tissue and for prophylactic use in 
dental treatment as an alternative to, or in conjunction with, 
antibiotic prophylaxis. 

Alternatively, the composition of the invention may be 
used to bathe an indwelling device immediately before 
insertion. The active agent will preferably be present at a 
concentration of 1 μg/ml to 10 mg/ml for bathing of wounds
or indwelling devices. 

A vaccine composition is conveniently in injectable form. 
Conventional adjuvants may be employed to enhance the 
immune response. 

A suitable unit dose for vaccination is 0.5-5 microgram/ 
kg of antigen, and such dose is preferably administered 1-3 
times and with an interval of 1-3 weeks. 
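
The unit dose above is stated per kilogram, so the antigen amount scales with body weight. A minimal sketch of that arithmetic follows; the function name and the 70 kg example weight are illustrative assumptions and do not come from the specification.

```python
# Values taken from the paragraph above.
UNIT_DOSE_UG_PER_KG = (0.5, 5.0)   # microgram of antigen per kg, per dose
N_DOSES = (1, 3)                   # the dose is administered 1-3 times


def antigen_range_ug(body_weight_kg):
    """Return ((per-dose low, high), (whole-course low, high)) in micrograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low = UNIT_DOSE_UG_PER_KG[0] * body_weight_kg
    high = UNIT_DOSE_UG_PER_KG[1] * body_weight_kg
    # Course minimum: one dose at the low end; maximum: three at the high end.
    return (low, high), (low * N_DOSES[0], high * N_DOSES[1])


if __name__ == "__main__":
    per_dose, course = antigen_range_ug(70.0)  # hypothetical 70 kg individual
    print("per dose (microgram):", per_dose)
    print("whole course (microgram):", course)
```

The sketch only restates the stated ranges; it does not choose a dose within them.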

With the indicated dose range, no adverse toxicological 
effects will be observed with the compounds of the invention 
which would preclude their administration to suitable indi- 
viduals. 

The antibodies described above may also be used as 
diagnostic reagents to detect the presence of bacteria con- 
taining Topoisomerase. 

Each reference disclosed herein is incorporated by refer- 
ence herein in its entirety. Any patent application to which 
this application claims priority is also incorporated by 
reference herein in its entirety. 

In order to facilitate understanding of the following 
example certain frequently occurring methods and/or terms 
will be described. 

EXAMPLES 

The present invention is further described by the follow- 
ing examples. The examples are provided solely to illustrate 
the invention by reference to specific embodiments. These 
exemplifications, while illustrating certain specific aspects
of the invention, do not portray the limitations or circum- 
scribe the scope of the disclosed invention. 

Certain terms used herein are explained in the foregoing 
glossary. 

All examples were carried out using standard techniques,
which are well known and routine to those of skill in the art,
except where otherwise described in detail. Routine molecular
biology techniques of the following examples can be
carried out as described in standard laboratory manuals, such
as Sambrook et al., MOLECULAR CLONING: A LABORATORY
MANUAL, 2nd Ed.; Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y. (1989).

All parts or amounts set out in the following examples are
by weight, unless otherwise specified.

Unless otherwise stated, size separation of fragments in
the examples below was carried out using standard techniques
of agarose and polyacrylamide gel electrophoresis
("PAGE") in Sambrook et al., MOLECULAR CLONING: A
LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y. (1989) and
numerous other references such as, for instance, by Goeddel
et al., Nucleic Acids Res. 8: 4057 (1980).

Unless described otherwise, ligations were accomplished 
using standard buffers, incubation temperatures and times, 
approximately equimolar amounts of the DNA fragments to 
be ligated and approximately 10 units of T4 DNA ligase 
("ligase") per 0.5 microgram of DNA. 

The polynucleotide having the DNA sequence given in 
(SEQ ID NO: 1) was obtained from the sequencing of a 
library of clones of chromosomal DNA of Staphylococcus 
aureus WCUH 29 in E. coli. 

To obtain the polynucleotide encoding the Topoisomerase
III protein using the DNA sequence given in (SEQ ID NO:
1), typically a library of clones of chromosomal DNA of
Staphylococcus aureus WCUH 29 in E. coli or some other
suitable host is probed with a radiolabelled oligonucleotide,
preferably a 17 mer or longer, derived from the partial
sequence. Clones carrying DNA identical to that of the probe 
can then be distinguished using high stringency washes. By 
sequencing the individual clones thus identified with 
sequencing primers designed from the original sequence it is 
then possible to extend the sequence in both directions to 
determine the full gene sequence. Conveniently such 
sequencing is performed using denatured double stranded
DNA prepared from a plasmid clone. Suitable techniques are 
described by Maniatis, T., Fritsch, E. F. and Sambrook et al.,
MOLECULAR CLONING, A LABORATORY MANUAL, 2nd
Ed.; Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, N.Y. (1989). (see Screening By Hybridization 1.90 
and Sequencing Denatured Double-Stranded DNA Tem- 
plates 13.70). 

Example 1 

Isolation of DNA coding for Novel Topoisomerase III Protein
from Staphylococcus aureus

The polynucleotide having the DNA sequence given in
(SEQ ID NO: 1) was obtained from a library of clones of
chromosomal DNA of Staphylococcus aureus in E. coli. In
some cases the sequencing data from two or more clones
containing overlapping Staphylococcus aureus DNA was
used to construct the contiguous DNA sequence in (SEQ ID
NO: 1). Libraries may be prepared by routine methods, for
example, Methods 1 and 2 below.

Total cellular DNA is isolated from Staphylococcus
aureus strain WCUH 29 according to standard procedures
and size-fractionated by either of two methods.

Method 1 

Total cellular DNA is mechanically sheared by passage
through a needle in order to size-fractionate according to
standard procedures. DNA fragments of up to 11 kbp in size
are rendered blunt by treatment with exonuclease and DNA
polymerase, and EcoRI linkers added. Fragments are ligated
into the vector Lambda ZapII that has been cut with EcoRI,
the library packaged by standard procedures and E. coli
infected with the packaged library. The library is amplified
by standard procedures.

Method 2

Total cellular DNA is partially hydrolysed with a combination
of four restriction enzymes (RsaI, PalI, AluI and
Bsh1235I) and size-fractionated according to standard procedures.
EcoRI linkers are ligated to the DNA and the
fragments then ligated into the vector Lambda ZapII that
have been cut with EcoRI, the library packaged by standard
procedures, and E. coli infected with the packaged library.
The library is amplified by standard procedures.

Example 2

Characterization of Topoisomerase III Gene Expression

a) Isolation of Staphylococcus aureus WCUH29 RNA from
infected tissue samples

Infected tissue samples, in 2-ml cryo-storage tubes, are
removed from -80° C. storage into a dry ice ethanol bath. In
a microbiological safety cabinet the samples are disrupted
up to eight at a time while the remaining samples are kept
frozen in the dry ice ethanol bath. To disrupt the bacteria
within the tissue sample, 50-100 mg of the tissue is transferred
to a FastRNA tube containing a silica/ceramic matrix
(BIO101). Immediately, 1 ml of extraction reagents
(FastRNA reagents, BIO101) are added to give a sample to
reagent volume ratio of approximately 1 to 20. The tubes are
shaken in a reciprocating shaker (FastPrep FP20, BIO101) at
6000 rpm for 20-120 sec. The crude RNA preparation is
extracted with chloroform/isoamyl alcohol, and precipitated
with DEPC-treated/Isopropanol Precipitation Solution
(BIO101). RNA preparations are stored in this isopropanol
solution at -80° C. if necessary. The RNA is pelleted (12,000
g for 10 min.), washed with 75% ethanol (v/v in DEPC-treated
water), air-dried for 5-10 min, and resuspended in
0.1 ml of DEPC-treated water.

Quality of the RNA isolated is assessed by running
samples on 1% agarose gels. 1x TBE gels stained with
ethidium bromide are used to visualise total RNA yields. To
demonstrate the isolation of bacterial RNA from the infected
tissue 1x MOPS, 2.2 M formaldehyde gels are run and
vacuum blotted to Hybond-N (Amersham). The blot is then
hybridized with a 32P labelled oligonucleotide probe specific
to 16S rRNA of Staphylococcus aureus (K. Greisen, M.
Loeffelholz, A. Purohit and D. Leong. J. Clin. Microbiol.
(1994) 32: 335-351). An oligonucleotide of the sequence:
5'-gctcctaaaaggttactccaccggc-3' is used as a probe. The size
of the hybridizing band is compared to that of control RNA
isolated from in vitro grown Staphylococcus aureus
WCUH29 in the Northern blot. Correct sized bacterial 16S
rRNA bands can be detected in total RNA samples which
show extensive degradation of the mammalian RNA when
visualised on TBE gels.

b) The removal of DNA from Staphylococcus aureus
WCUH29-derived RNA

DNA was removed from 50 ng samples of RNA by a 30
minute treatment at 37° C. with 10 units of RNAase-free
DNAaseI (GeneHunter) in the buffer supplied in a final
volume of 57 microliters.

The DNAase was inactivated and removed by
phenol:chloroform extraction. RNA was precipitated with 5
microliters of 3 M NaOAc and 200 microliters 100% EtOH,
and pelleted by centrifugation at 12,000 g for 10 minutes.
The RNA is pelleted (12,000 g for 10 min.), washed with
75% ethanol (v/v in DEPC-treated water), air-dried for 5-10
min, and resuspended in 10-20 microliters of DEPC-treated
water. RNA yield is quantitated by OD260 after 1:1000
dilution of the cleaned RNA sample. RNA is stored at -80°
C. if necessary and reverse-transcribed within one week.

c) The preparation of cDNA from RNA samples derived from
infected tissue

10 microliter samples of DNAase treated RNA are reverse
transcribed using a SuperScript Preamplification System for
First Strand cDNA Synthesis kit (Gibco BRL, Life
Technologies) according to the manufacturer's instructions. 1
nanogram of random hexamers is used to prime each reaction.
Controls without the addition of SuperScriptII reverse
transcriptase are also run. Both +/-RT samples are treated
with RNaseH before proceeding to the PCR reaction.

The use of PCR and fluorogenic probes to determine the
[illegible]

Specific sequence detection occurs by amplification of
target sequences in the PE Applied Biosystems 7700
Sequence Detection System in the presence of an
oligonucleotide probe labeled at the 5' and 3' ends with a reporter
and quencher fluorescent dye, respectively (FQ probe),
which anneals between the two PCR primers. Only specific
product will be detected when the probe is bound between
the primers. As PCR amplification proceeds, the 5'-nuclease
activity of Taq polymerase initially cleaves the reporter dye
from the probe. The signal generated when the reporter dye
is physically separated from the quencher dye is detected by
measuring the signal with an attached CCD camera. Each
signal generated equals one probe cleaved which corresponds
to amplification of one target strand.

PCR reactions are set up using the PE Applied Biosystem TaqMan PCR
Core Reagent Kit according to the instructions supplied such
that each reaction contains 5 microliters 10x PCR Buffer II,
7 microliters 25 mM MgCl2, 5 microliters 300 nM forward
primer, 5 microliters reverse primer, 5 microliters specific
FQ probe, 1 microliter each 10 mM dATP, 10 mM dCTP, 10
mM dGTP and 20 mM dUTP, 13.25 microliters distilled
water, 0.5 microliters AmpErase UNG, and 0.25 microliters
AmpliTaq DNA polymerase to give a total volume of 45
microliters.

Amplification proceeds under the following thermal
cycling conditions: 50° C. hold for 2 minutes, 95° C. hold
for 10 minutes, 40 cycles of 95° C. for 15 seconds and 60°
C. for 1 minute, followed by a 25° C. hold until sample is
retrieved. Detection occurs real-time. Data is collected at the
end of the reaction.

RT/PCR controls may include +/-reverse transcriptase
reactions, amplification along side genes known to be transcribed
under the conditions of study and amplification of 1
microgram of genomic DNA.

Primers and corresponding probes which fail to
generate signal in DNA PCR or RT/PCR are PCR failures
[illegible]. Of those which generate
signal in DNA PCR, two classes are distinguished in
RT/PCR: 1. Genes which are not transcribed in vivo reproducibly
fail to generate signal in RT/PCR; and 2. Genes
which are transcribed in vivo reproducibly generate signal in
RT/PCR and show a stronger signal in the +RT samples than
the signal (if at all present) in -RT controls. Based on these
analyses it was discovered that S. aureus topoisomerase III
gene was expressed in vivo.
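
The reaction composition and thermal profile given for the PCR step can be tallied as a quick consistency check. The sketch below assumes the component volumes and cycling times exactly as listed in the text. The function names and the pipetting overage are hypothetical and are not part of the described method.

```python
# Per-reaction component volumes in microliters, as listed in the text.
REACTION_MIX_UL = {
    "10x PCR Buffer II": 5.0,
    "25 mM MgCl2": 7.0,
    "forward primer": 5.0,
    "reverse primer": 5.0,
    "FQ probe": 5.0,
    "dATP": 1.0,
    "dCTP": 1.0,
    "dGTP": 1.0,
    "dUTP": 1.0,
    "distilled water": 13.25,
    "AmpErase UNG": 0.5,
    "AmpliTaq DNA polymerase": 0.25,
}

# Thermal profile from the text: two holds, then 40 two-step cycles.
# The final 25 C hold is open-ended and is left out of the tally.
HOLDS_S = [(50, 2 * 60), (95, 10 * 60)]   # (temperature in C, seconds)
CYCLE_STEPS_S = [(95, 15), (60, 60)]
N_CYCLES = 40


def total_volume_ul(mix):
    """Sum the per-reaction component volumes."""
    return sum(mix.values())


def master_mix_ul(mix, n_reactions, overage=0.10):
    """Scale each component for n_reactions plus a pipetting overage.

    The 10% default overage is an illustrative assumption, not a value
    taken from the text.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in mix.items()}


def programmed_seconds():
    """Programmed hold time, ignoring ramping and the final 25 C hold."""
    holds = sum(seconds for _, seconds in HOLDS_S)
    cycling = N_CYCLES * sum(seconds for _, seconds in CYCLE_STEPS_S)
    return holds + cycling


if __name__ == "__main__":
    print("volume per reaction (microliters):", total_volume_ul(REACTION_MIX_UL))
    print("programmed time (minutes):", programmed_seconds() / 60)
    for name, vol in master_mix_ul(REACTION_MIX_UL, 10).items():
        print(f"{name}: {vol}")
```

The listed components sum to the stated 45 microliters, and the programmed steps come to 62 minutes before ramp times are added.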
Primers used for Example 2 are as follows:

topB fwd primer  GTTATACGATATGATTGTCGAGCGT  [SEQ ID NO: 6]
topB rev primer  GTGCCCTGCAACCTCTAAACT  [SEQ ID NO: 7]
topB probe  FAM-CCTCCGCACGAGTATGACGCG-TAMRA  [SEQ ID NO: 8]

FAM and TAMRA labeling of primers and the uses of
such primers have been reported (Lee, L G, Connell, C R, and
Bloch, W. 1993. Allelic discrimination by nick-translation
PCR with fluorogenic probes. Nucleic Acids Research
21:3761-3766; Livak, K J, Flood, S J A, Marmaro, J., Giusti,
W, and Deetz, K. 1995. Oligonucleotides with fluorescent
dyes at opposite ends provide a quenched probe system
useful for detecting PCR product and nucleic acid
hybridization. PCR Methods and Applications 4:357-362.).
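
The primers and probe can be located on the sequence of SEQ ID NO: 1 by plain string matching. The sketch below uses residues 1381-1560 of the listing as transcribed in this document. It searches for the forward primer and the probe on the listed strand, and for the reverse complement of the reverse primer. The helper names are hypothetical, and the search is mismatch-tolerant on the assumption that a scanned listing may contain transcription errors.

```python
# Residues 1381-1560 of SEQ ID NO: 1, copied from the sequence listing.
FRAGMENT = (
    "GAAGTGAGAC CTGTCATGTC AGACTTAAGT AATAGAGAAT TAAAGTTATA CGATATGATT "
    "GTCGAGCGTT TTTTAGAAGC TTTAATGCCT CCGCACGAGT ATGACGCGAT AACTGTAACT "
    "TTAGAGGTTG CAGGGCACAC ATTTGTTTTG AAAGAGAATG TAACAACTGT TTTAGGTTTT"
).replace(" ", "")
FRAGMENT_START = 1381  # 1-based position of FRAGMENT[0] in SEQ ID NO: 1

FWD_PRIMER = "GTTATACGATATGATTGTCGAGCGT"   # SEQ ID NO: 6
REV_PRIMER = "GTGCCCTGCAACCTCTAAACT"       # SEQ ID NO: 7
PROBE = "CCTCCGCACGAGTATGACGCG"            # SEQ ID NO: 8, dyes omitted

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_match(template, query):
    """Return (start, mismatches) of the best ungapped placement of query.

    start is a 0-based index into template; ties go to the leftmost hit.
    """
    best = None
    for start in range(len(template) - len(query) + 1):
        window = template[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(window, query))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


if __name__ == "__main__":
    queries = [
        ("fwd primer", FWD_PRIMER),
        ("probe", PROBE),
        ("rev primer (reverse complement)", reverse_complement(REV_PRIMER)),
    ]
    for label, query in queries:
        start, mismatches = best_match(FRAGMENT, query)
        print(label, "position", FRAGMENT_START + start, "mismatches", mismatches)
```

With the text as transcribed here, the forward primer and the probe match exactly at positions 1425 and 1468, and the probe lies between the two primers as the assay requires. The reverse primer's complement aligns with a single mismatch, which may reflect a transcription error in either the primer table or the listing.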



SEQUENCE LISTING 

(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 8

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2803 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTCATTGTAC TGTTGAGGAA GTTTATATAA GTATGATGCT GATTCATAAT TCGAATGTTC 60 

AATGAACGAT TTTATTTGTG TAATATCATA TAACTGAACT ATGCTCATGT CATTACCTCC 120 

GTACTTTTTG TTACTTTTAT TATATAGTAT TTCAACTGAA ATGAAAGTTA ATAGTGATAT 180 

TAACATGTTA CAATACATTT AACACCATTG AATTTAAATC AAAGATTAGT GGAATAGATG 240

AAGCACGTTC GAAATAAAAG AACGTATGAG AAAGGATAAT TTATGAAATC TTTAATATTA 300 

GCTGAAAAAC CATCAGTTGC AAGAGATATT GCTGATGCTT TACAAATAAA TCAGAAGCGT 360

AATGGTTACT TTGAAAATAA CCAATATATT GTCACGTGGG CGTTAGGTCA TCTAGTGACA 420 

AATGCGACAC CTGAACAATA CGATAAAAAT TTAAAGGAAT GGCGATTAGA AGACCTTCCA 480

ATTATACCTA AATATATGAA AACTGTTGTT ATTGGTAAAA CAAGCAAACA ATTTAAAACA 540

GTAAAAGCGT TAATTTTAGA TAATAAAGTG AAAGATATTA TTATTGCAAC AGATGCTGGA 600

CGAGAAGGTG AACTAGTTGC AAGATTGATT TTGGATAAAG TTGGTAACAA AAAGCCAATC 660

CGTCGATTAT GGATTAGCTC AGTTACTAAA AAAGCTATTC AACAAGGTTT TAAAAATTTA 720 

AAAGACGGTC GTCAATATAA CGATTTGTAT TATGCAGCGT TAGCGAGAAG CGAGGCAGAT 780 

TGGATTGTTG GGATTAATGC AACGCGTGCA CTAACAACAA AGTATGATGC ACAGCTATCC 840 

CTGGGACGTG TTCAGACACC AACGATTCAA TTAGTAAATA CACGACAACA AGAGATTAAT 900 

CAGTTCAAAC CACAACAATA CTTTACATTA TCATTAACGG TAAAAGGGTT TGATTTTCAG 960

CTAGAATCAA ATCAGCGATA TACCAATAAA GAAACTTTAG AACAGATGGT TAATAATTTG 1020 

AAAAATGTCG ATGGTAAGAT TAAATCTGTT GCTACTAAAC ATAAGAAGTC GTATCCGCAA 1080 

TCACTGTACA ATTTAACAGA TTTACAACAA GATATGTATA GACGTTATAA AATTGGACCT 1140 

AAAGAAACAT TGAATACACT TCAAAGCTTA TATGAGAGAC ATAAAGTCGT AACCTATCCA 1200 

AGAACAGATT CAAACTATTT AACAACTGAT ATGGTAGATA CTATGAAAGA ACGTATTCAG 1260 

GCGACGATGG CAACAACATA TAAAGACCAA GCACGCCCAT TAATGTCTAA AACATTTTCA 1320 

TCAAAAATGT CGATATTTAA TAATCAAAAA GTATCTGATC ACCATGCAAT TATTCCTACA 1380 

GAAGTGAGAC CTGTCATGTC AGACTTAAGT AATAGAGAAT TAAAGTTATA CGATATGATT 1440 

GTCGAGCGTT TTTTAGAAGC TTTAATGCCT CCGCACGAGT ATGACGCGAT AACTGTAACT 1500 

TTAGAGGTTG CAGGGCACAC ATTTGTTTTG AAAGAGAATG TAACAACTGT TTTAGGTTTT 1560 

AAATCTATTA GACAAGGTGA ATCTATTACA GAGATGCAAC AGCCTTTTTC AGAAGGCGAT 1620 

GAAGTGAAGA TTTCAAAAAC AAACATTAGA GAACATGAAA CAACACCTCC AGAATATTTT 1680 

AATGAAGGTT CGTTATTAAA AGCGATGGAG AACCCTCAGA ACTTTATTCA ATTGAAGGAT 1740

AAAAAATATG CGCAAACTTT AAAACAAACA GGTGGTATCG GCACAGTTGC AACAAGGGCC 1800 

GACATTATCG ATAAATTATT TAATATGAAT GCCATTGAAT CAAGAGACGG TAAAATTAAA 1860 

GTAACGTCAA AAGGTAAACA AATATTAGAA TTAGCACCAG AAGAATTAAC GTCGCCACTT 1920

TTAACTGCAC AATGGGAAGA AAAATTACTT TTAATTGAAC GTGGTAAATA TCAGGCGAAA 1980 

ACATTTATTA ATGAAATGAA AGATTTTACG AAAGATGTTG TAAATGGGAT TAAAAATAGT 2040 

GATCGTAAAT ATAAACACGA TAATTTAACA ACCACAGAAT GCCCAACGTG TGGTAAATTC 2100

ATGATTAAAG TTAAAACTAA AAATGGTCAG ATGCTTGTGT GCCAAGATCC ATCTTGTAAG 2160 

ACGAAAAAGA ATGTACAGCG CAAAACAAAT GCAAGATGTC CAAACTGTAA AAAGAAATTA 2220 

ACGTTGTTTG GTAAAGGGAA AGAAGCGGTA TATCGTTGTG TTTGTGGACA TTCTGAAACG 2280 

CAAGCACATA TGGATCAGCG TATGAAGTCT AAATCCTCTG GTAAAGTATC TCGTAAAGAA 2340 

ATGAAAAAGT ATATGAATAA AAATGAAGGT TTAGACAATA ATCCGTTTAA AGATGCATTA 2400 

AAGAACTTGA ATTTATAGAT AAAATCGAAC AAAGTTGAAT CAGAAAAACG AAAAGTTCGC 2460

TTTTGGTATT GTTTTTTATT AAGAATGATA TTAAACTATT AAGGTATTTT AAAAAAAGGA 2520 

GCATCCATTC GTGAAAAACT ATTTCCAGTT CGATAAATAT GGAACAAACT TTAAAAGAGA 2580 

AATCTTAGGC GGTATCACAA CTTTCTTATC TATGGCCTAT ATTTTAGCAG TTAACCCGCA 2640 

AGTTTTAAGT TTAGCAGGTG TTAAAGGCGT ATCAGAAGAT ATGAAAATGG ACCAAGGTGC 2700 

CATTTTTGTA GCGACTGCAT TAGCAGCATT TGTAGGCTCG CTATTCATGG GACTAATAGC 2760 

TAAATATCCA ATCGCATTAG CACCAGGTAT GGGATTGGAA TTC 2803 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 711 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Lys Ser Leu Ile Leu Ala Glu Lys Pro Ser Val Ala Arg Asp Ile
1 5 10 15

Ala Asp Ala Leu Gln Ile Asn Gln Lys Arg Asn Gly Tyr Phe Glu Asn
20 25 30

Asn Gln Tyr Ile Val Thr Trp Ala Leu Gly His Leu Val Thr Asn Ala
35 40 45

Thr Pro Glu Gln Tyr Asp Lys Asn Leu Lys Glu Trp Arg Leu Glu Asp
50 55 60

Leu Pro Ile Ile Pro Lys Tyr Met Lys Thr Val Val Ile Gly Lys Thr
65 70 75 80

Ser Lys Gln Phe Lys Thr Val Lys Ala Leu Ile Leu Asp Asn Lys Val
85 90 95

Lys Asp Ile Ile Ile Ala Thr Asp Ala Gly Arg Glu Gly Glu Leu Val
100 105 110

Ala Arg Leu Ile Leu Asp Lys Val Gly Asn Lys Lys Pro Ile Arg Arg
115 120 125

Leu Trp Ile Ser Ser Val Thr Lys Lys Ala Ile Gln Gln Gly Phe Lys
130 135 140

Asn Leu Lys Asp Gly Arg Gln Tyr Asn Asp Leu Tyr Tyr Ala Ala Leu
145 150 155 160

Ala Arg Ser Glu Ala Asp Trp Ile Val Gly Ile Asn Ala Thr Arg Ala
165 170 175

Leu Thr Thr Lys Tyr Asp Ala Gln Leu Ser Leu Gly Arg Val Gln Thr
180 185 190

Pro Thr Ile Gln Leu Val Asn Thr Arg Gln Gln Glu Ile Asn Gln Phe
195 200 205

Lys Pro Gln Gln Tyr Phe Thr Leu Ser Leu Thr Val Lys Gly Phe Asp
210 215 220

Phe Gln Leu Glu Ser Asn Gln Arg Tyr Thr Asn Lys Glu Thr Leu Glu
225 230 235 240

Gln Met Val Asn Asn Leu Lys Asn Val Asp Gly Lys Ile Lys Ser Val
245 250 255

Ala Thr Lys His Lys Lys Ser Tyr Pro Gln Ser Leu Tyr Asn Leu Thr
260 265 270

Asp Leu Gln Gln Asp Met Tyr Arg Arg Tyr Lys Ile Gly Pro Lys Glu
275 280 285

Thr Leu Asn Thr Leu Gln Ser Leu Tyr Glu Arg His Lys Val Val Thr
290 295 300

Tyr Pro Arg Thr Asp Ser Asn Tyr Leu Thr Thr Asp Met Val Asp Thr
305 310 315 320

Met Lys Glu Arg Ile Gln Ala Thr Met Ala Thr Thr Tyr Lys Asp Gln
325 330 335

Ala Arg Pro Leu Met Ser Lys Thr Phe Ser Ser Lys Met Ser Ile Phe
340 345 350

Asn Asn Gln Lys Val Ser Asp His His Ala Ile Ile Pro Thr Glu Val
355 360 365

Arg Pro Val Met Ser Asp Leu Ser Asn Arg Glu Leu Lys Leu Tyr Asp
370 375 380

Met Ile Val Glu Arg Phe Leu Glu Ala Leu Met Pro Pro His Glu Tyr
385 390 395 400

Asp Ala Ile Thr Val Thr Leu Glu Val Ala Gly His Thr Phe Val Leu
405 410 415

Lys Glu Asn Val Thr Thr Val Leu Gly Phe Lys Ser Ile Arg Gln Gly
420 425 430

Glu Ser Ile Thr Glu Met Gln Gln Pro Phe Ser Glu Gly Asp Glu Val
435 440 445

Lys Ile Ser Lys Thr Asn Ile Arg Glu His Glu Thr Thr Pro Pro Glu
450 455 460

Tyr Phe Asn Glu Gly Ser Leu Leu Lys Ala Met Glu Asn Pro Gln Asn
465 470 475 480

Phe Ile Gln Leu Lys Asp Lys Lys Tyr Ala Gln Thr Leu Lys Gln Thr
485 490 495

Gly Gly Ile Gly Thr Val Ala Thr Arg Ala Asp Ile Ile Asp Lys Leu
500 505 510

Phe Asn Met Asn Ala Ile Glu Ser Arg Asp Gly Lys Ile Lys Val Thr
515 520 525

Ser Lys Gly Lys Gln Ile Leu Glu Leu Ala Pro Glu Glu Leu Thr Ser
530 535 540

Pro Leu Leu Thr Ala Gln Trp Glu Glu Lys Leu Leu Leu Ile Glu Arg
545 550 555 560

Gly Lys Tyr Gln Ala Lys Thr Phe Ile Asn Glu Met Lys Asp Phe Thr
565 570 575

Lys Asp Val Val Asn Gly Ile Lys Asn Ser Asp Arg Lys Tyr Lys His
580 585 590

Asp Asn Leu Thr Thr Thr Glu Cys Pro Thr Cys Gly Lys Phe Met Ile
595 600 605

Lys Val Lys Thr Lys Asn Gly Gln Met Leu Val Cys Gln Asp Pro Ser
610 615 620

Cys Lys Thr Lys Lys Asn Val Gln Arg Lys Thr Asn Ala Arg Cys Pro
625 630 635 640

Asn Cys Lys Lys Lys Leu Thr Leu Phe Gly Lys Gly Lys Glu Ala Val
645 650 655

Tyr Arg Cys Val Cys Gly His Ser Glu Thr Gln Ala His Met Asp Gln
660 665 670

Arg Met Lys Ser Lys Ser Ser Gly Lys Val Ser Arg Lys Glu Met Lys
675 680 685

Lys Tyr Met Asn Lys Asn Glu Gly Leu Asp Asn Asn Pro Phe Lys Asp
690 695 700

Ala Leu Lys Asn Leu Asn Leu
705 710
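
The relationship between the 711-residue polypeptide of SEQ ID NO: 2 and the coding region recited in the claims (nucleotides 283 to 2415 of SEQ ID NO: 1) can be checked with simple coordinate arithmetic. The sketch below is illustrative only and not part of the original disclosure; the names are assumptions, and the codon table covers just the six codons present in the short excerpt used.

```python
# Coding-region coordinates (1-based, inclusive) as recited in the claims.
CDS_START, CDS_END = 283, 2415

cds_length = CDS_END - CDS_START + 1
codon_count, remainder = divmod(cds_length, 3)

# Nucleotides 281-300 of SEQ ID NO: 1, copied from the Sequence Listing.
EXCERPT_START = 281
EXCERPT = "TTATGAAATCTTTAATATTA"

# Partial codon table: only the codons that occur in the excerpt.
CODON_TABLE = {
    "ATG": "Met",
    "AAA": "Lys",
    "TCT": "Ser",
    "TTA": "Leu",
    "ATA": "Ile",
}


def translate_excerpt(dna: str, excerpt_start: int, cds_start: int) -> list[str]:
    """Translate complete codons of `dna`, reading in frame from `cds_start`."""
    offset = cds_start - excerpt_start
    codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
    return [CODON_TABLE[codon] for codon in codons]


first_residues = translate_excerpt(EXCERPT, EXCERPT_START, CDS_START)

print(f"coding region length: {cds_length} nt")
print(f"codons: {codon_count} (remainder {remainder})")
print("first residues:", " ".join(first_residues))
```

The first six translated residues agree with the start of SEQ ID NO: 2, and the codon count matches the stated length of 711 amino acids.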



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GAATGAGCCG CAACTTCGGG AT 22



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
TAAAAGAACG TATGAGAAAG 20



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AAAAACAATA CCAAAAGCGA ACT 23

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GTTATACGAT ATGATTGTCG AGCGT 25 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GTGCCCTGCA ACCTCTAAAG T 21 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CCTCCGCACG AGTATGACGC G 21 



What is claimed is: 

1. An isolated polynucleotide segment comprising: a first 
polynucleotide sequence, wherein the first polynucleotide 
sequence (a) is a reference sequence that encodes the amino 
acid sequence set forth in SEQ ID NO:2, or (b) has at least
95% identity with the reference sequence, wherein said 
identity is determined using an algorithm selected from the 
group consisting of BLASTP, BLASTN or FASTA, where 
the algorithm is adapted to give the largest match between 
sequences tested, over the entire length of the reference 
sequence. 
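
By way of illustration only, and not as part of the claim, percent identity "over the entire length of the reference sequence" can be sketched as the number of matching aligned positions divided by the ungapped length of the reference. The Python sketch below assumes the two sequences have already been aligned (for example by BLASTN or FASTA) and are supplied as equal-length strings with "-" marking gaps; it does not perform the alignment itself, and the example sequences and function name are hypothetical.

```python
def percent_identity(ref_aligned: str, query_aligned: str) -> float:
    """Percent identity of a pre-aligned pair, normalised to the reference length.

    Both arguments must be equal-length aligned strings in which '-' marks a gap.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_length = sum(1 for base in ref_aligned if base != "-")
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    matches = sum(
        1
        for ref_base, query_base in zip(ref_aligned, query_aligned)
        if ref_base == query_base and ref_base != "-"
    )
    return 100.0 * matches / ref_length


# Hypothetical 20-nucleotide reference and a query with one substitution.
reference = "ATGAAATCTTTAATATTAGC"
one_mismatch = "ATGAAATCTTTAATATTAGG"
identity_substitution = percent_identity(reference, one_mismatch)

# Hypothetical 10-nucleotide pair in which the query has one deletion.
identity_deletion = percent_identity("ATGAAATCTT", "ATG-AATCTT")

print(identity_substitution, identity_substitution >= 95.0)
print(identity_deletion, identity_deletion >= 95.0)
```

Normalising by the reference length, rather than by the length of the aligned region, is the design choice that reflects the "entire length" wording; a short local match therefore cannot reach the threshold on its own.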

2. An isolated polynucleotide segment according to claim 
1 comprising: a first polynucleotide sequence, wherein the 
first polynucleotide sequence (a) encodes a reference 
sequence that has the amino acid sequence set forth in SEQ 
ID NO:2, or (b) encodes a polypeptide sequence identical
with the reference sequence except that, over the entire 
length corresponding to the reference sequence, the encoded 
polypeptide sequence has up to ten amino acid substitutions, 
amino acid deletions or amino acid insertions. 



3. The isolated polynucleotide segment of claim 1, comprising
the first polynucleotide sequence wherein the first
polynucleotide sequence is (a) identical with the reference
sequence, or (b) has at least 97% identity with the reference
sequence, wherein said identity is determined using an
algorithm selected from the group consisting of BLASTP,
BLASTN or FASTA, where the algorithm is adapted to give
the largest match between sequences tested, over the entire
length of the reference sequence.

4. The isolated polynucleotide segment of claim 1,
wherein the first polynucleotide sequence encodes a
topoisomerase III polypeptide.

5. An isolated polynucleotide segment comprising the full
complement of the entire length of the first polynucleotide
sequence of claim 1.

6. The isolated polynucleotide segment of claim 5,
wherein the first polynucleotide sequence (a) encodes a
reference sequence that has the amino acid sequence set
forth in SEQ ID NO:2, or (b) encodes a polypeptide
sequence identical with the reference sequence except that,
over the entire length corresponding to the reference
sequence, the encoded polypeptide sequence has up to five
amino acid substitutions, amino acid deletions or amino acid
insertions.

7. The isolated polynucleotide segment of claim 6,
wherein the first polynucleotide sequence encodes a
topoisomerase III polypeptide.

8. A vector comprising the polynucleotide segment of
claim [illegible].

9. A vector comprising the polynucleotide segment of
claim 5.

10. An isolated host cell transformed with the
polynucleotide segment of claim 1 to express the first
polynucleotide sequence.

11. A process for producing a topoisomerase III polypeptide
of the first polynucleotide sequence comprising the step
of culturing the host cell of claim 10 under conditions
sufficient for the production of said polypeptide, which is
encoded by the first polynucleotide sequence.

12. An isolated polynucleotide segment comprising a first
polynucleotide sequence, wherein the first polynucleotide
sequence is (a) a first reference sequence that encodes the
amino acid sequence set forth in SEQ ID NO:2, or (b) has
at least 90% identity with the first reference sequence,
wherein said identity is determined using an algorithm
selected from the group consisting of BLASTP, BLASTN or
FASTA, where the algorithm is adapted to give the largest
match between sequences tested, over the entire length of
the first reference sequence
wherein the first polynucleotide sequence is (a) a second
reference sequence which encodes the same mature
polypeptide, expressed by the topoisomerase III gene
contained in Staphylococcus aureus WCUH 29 contained
in NCIMB Deposit No. 40771, or (b) has at least
95% identity with the second reference sequence,
wherein said identity is determined using an algorithm
selected from the group consisting of BLASTP,
BLASTN or FASTA, where the algorithm is adapted to
give the largest match between sequences tested, over
the entire length of the second reference sequence.

13. An isolated polynucleotide segment of claim 12,
wherein the first polynucleotide sequence is (a) the second
reference sequence, or (b) has at least 97% identity with the
second reference sequence, wherein said identity is determined
using an algorithm selected from the group consisting
of BLASTP, BLASTN or FASTA, where the algorithm is
adapted to give the largest match between sequences tested,
over the entire length of the second reference sequence.

14. An isolated polynucleotide segment of claim 13
comprising a first polynucleotide sequence encoding the
same mature polypeptide expressed by the topoisomerase III
gene contained in Staphylococcus aureus WCUH 29 contained
in NCIMB Deposit No. 40771, or the full complement
of the entire length of such first polynucleotide
sequence.

15. A polynucleotide encoding a fusion polypeptide
including a polynucleotide segment according to claim 12.

16. The isolated polynucleotide segment of claim 1,
wherein the first polynucleotide sequence is identical with a
third reference sequence which (a) is nucleotides 283 to
2415 inclusive of the polynucleotide sequence set forth in
SEQ ID NO:1, or (b) has at least 95% identity with the third
reference sequence, wherein said identity is determined
using an algorithm selected from the group consisting of
BLASTP, BLASTN or FASTA, where the algorithm is
adapted to give the largest match between sequences tested,
over the entire length of the third reference sequence;
wherein the first polynucleotide sequence hybridizes
under stringent conditions to a polynucleotide sequence
which is the full complement of said third reference
sequence, wherein stringent conditions means hybridization
will occur only if there is at least 95% identity
between the polynucleotide sequences to be hybridized.

17. An isolated polynucleotide segment of claim 16,
wherein the first polynucleotide sequence (a) is the sequence
from nucleotides 283 to 2415 inclusive of the polynucleotide
sequence set forth in SEQ ID NO:1, or (b) has at least 97%
identity with the third reference sequence, wherein said
identity is determined using an algorithm selected from the
group consisting of BLASTP, BLASTN or FASTA, where
the algorithm is adapted to give the largest match between
sequences tested, over the entire length of the third reference
sequence.

18. The isolated polynucleotide segment of claim 1,
wherein the first polynucleotide sequence hybridizes under
stringent conditions to a polynucleotide sequence which is
the full complement of a third reference polynucleotide
having nucleotides 283 to 2415 of SEQ ID NO:1.

19. A recombinant polynucleotide segment comprising
nucleotides 283 to 2415 of the polynucleotide sequence set
forth in SEQ ID NO:1, or the full complement of the entire
length of the nucleotide sequence set forth in SEQ ID NO:1.

20. A recombinant polynucleotide segment which encodes
a polypeptide comprising a region having the amino acid
sequence of SEQ ID NO:2.

21. A vector comprising the recombinant polynucleotide
segment of claim 20.

22. An isolated host cell transformed with the recombinant
polynucleotide segment of claim 20 to express the
recombinant polynucleotide segment.

23. A process for producing a topoisomerase III polypeptide
of the polynucleotide sequence comprising the step of
culturing a host cell of claim 22 under conditions sufficient
for the production of said polypeptide.

24. The isolated polynucleotide of claim 1, wherein said
isolated polynucleotide encodes a topoisomerase III
polypeptide that is involved in altering DNA topology in a
bacterial cell.

25. The isolated polynucleotide of claim 16, wherein said
isolated polynucleotide encodes a topoisomerase III
polypeptide that is involved in altering DNA topology in a
bacterial cell.

*****
HUMAN ENA/VASP-LIKE PROTEIN SPLICE VARIANT

This application is a divisional application of U.S. appli- 
cation Ser. No. 09/227,420, filed Jan. 8, 1999, issued as U.S. 
Pat. No. 5,990,087 which is a divisional of U.S. application 
Ser. No. 09/026,587; filed Feb. 20, 1998, issued Jun. 15, 
1999, as U.S. Pat. No. 5,912,128. 

FIELD OF THE INVENTION

This invention relates to nucleic acid and amino acid
sequences of a human ena/VASP-like protein splice variant
and to the use of these sequences in the diagnosis, treatment,
and prevention of reproductive, immunological, vesicle
trafficking, nervous system, developmental, and neoplastic
disorders.

BACKGROUND OF THE INVENTION 

The control of cell morphology and motility requires the
coupling of external stimuli to processes that alter the
cytoskeletal architecture. The mechanical forces that drive
morphological change and migration arise initially from the
microfilament-based cytoskeleton. A large body of evidence
links various signal transduction pathways to the formation
of cellular outgrowths. The migration of neuronal growth
cones is a well-studied mechanism for the actin-driven
formation of membrane protrusions. In one example, the
processes of axonal outgrowth are mediated by the Drosophila
homolog of the c-Abl tyrosine kinase (Abl) and the
product of the Disabled gene (Dab). Homozygous mutants
of Abl and Dab make few or no proper axonal connections.
The defects caused by loss of Abl and Dab in Drosophila are
ameliorated by mutations in the Enabled (Ena) gene. Ena
protein is tyrosine phosphorylated and has a proline-rich
core which binds to the SH3 domains of Abl protein and Src
protein in vitro. The murine homolog of Ena (Mena) and
ena/VASP-like protein have recently been described and are
members of a family of related molecules that include
vasodilator-stimulated phosphoprotein (VASP). These proteins
share three distinct regions of similarity: an amino-terminal
115 amino acids (EVH1 domain); a proline-rich
core; and a carboxy-terminal 226 amino acids (EVH2
domain). Mena has phosphotyrosine and phosphoserine
moieties and binds Abl and Src SH3 domains. (Gertler, F. B.
et al. (1996) Cell 87:227-239.)

Human platelet activation is inhibited by agents such as
prostaglandins and nitric oxide donors, which elevate
intracellular cAMP or cGMP levels. Activation of platelets is
associated with increased formation of intracellular F-actin.
VASP is an abundant in vivo substrate for cyclic
nucleotide-dependent protein kinases in platelets. VASP is a ligand for
profilin, an actin-monomer binding protein that can stimulate
the formation of F-actin. VASP is organized into three
distinct domains. A central proline-rich domain contains a
GPPPPP motif as a single copy and as a 3-fold tandem
repeat, as well as three conserved phosphorylation sites for
cyclic nucleotide-dependent protein kinases. A C-terminal
domain contains a repetitive mixed-charge cluster which is
predicted to form an alpha-helix. VASP expression in transiently
transfected BHK21 cells was predominantly detected
at stress fibers, at focal adhesions, and in F-actin-containing
cell surface protrusions. In contrast, truncated VASP lacking
the C-terminal domain was no longer concentrated at focal
adhesions. These data indicate that the C-terminal domain is
required for anchoring VASP at focal adhesion sites, while
the central domain may mediate VASP interaction with
profilin. (Ermekova, K. S. et al. (1997) J. Biol. Chem.
272:32869-32877.)
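
As an illustration only, the occurrence of a GPPPPP motif as a single copy and as a tandem repeat can be located in a protein sequence with a simple pattern search. The Python sketch below is not part of the original disclosure; the function name is an assumption, and the test sequence is a made-up string constructed for the example rather than the sequence of VASP or of any real protein.

```python
import re

MOTIF = "GPPPPP"


def find_motif_runs(protein: str) -> list[tuple[int, int]]:
    """Return (1-based start position, number of tandem copies) for each motif run."""
    runs = []
    for match in re.finditer(f"(?:{MOTIF})+", protein):
        copies = len(match.group(0)) // len(MOTIF)
        runs.append((match.start() + 1, copies))
    return runs


# Hypothetical sequence: one single copy and one 3-fold tandem repeat of the motif.
toy_protein = "MSEAA" + MOTIF + "LQTSA" + MOTIF * 3 + "EEK"
motif_runs = find_motif_runs(toy_protein)

for start, copies in motif_runs:
    print(f"position {start}: {copies} tandem cop{'y' if copies == 1 else 'ies'}")
```

Grouping adjacent copies into one run distinguishes a tandem repeat from isolated single copies, which is the distinction drawn in the paragraph above.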

In comparison, Mena binds FE65, a neuronal protein
which binds to the cytoplasmic portion of the β-amyloid
precursor protein (β-APP). β-APP is a precursor to
β-amyloid peptide, the major constituent of the extracellular
plaques present in brain tissue from Alzheimer disease
patients. Both VASP and Mena bind their respective adapter
proteins (profilin or FE65) via distinct proline-rich regions
and thus regulate adapter interaction(s) with other molecules.
(Ermekova, K. S. et al., supra.) Proline-rich domains
and proline clusters have been identified in many proteins,
particularly those that are associated with synaptic vesicles
and other secretory organelles. These domains and clusters
act as protein-protein interaction modules. (Linial, M.
(1994) Neuroreport. 5:2009-2015.) Two members of
ena/VASP-like proteins have been recently isolated from mouse
and rat and share 98.5% sequence identity. (Gertler, supra;
and Ohta, S. et al. (1997) Biochem. Biophys. Res. Comm.
237:307-312.)

The discovery of a new human ena/VASP-like protein 
splice variant and the polynucleotides encoding it satisfies a 
need in the art by providing new compositions which are
useful in the diagnosis, treatment, and prevention of 
reproductive, immunological, vesicle trafficking, nervous 
system, developmental, and neoplastic disorders. 

SUMMARY OF THE INVENTION 

The invention is based on the discovery of a new human
ena/VASP-like protein splice variant (EVL1), the polynucleotides
encoding EVL1, and the use of these compositions for
the diagnosis, treatment, or prevention of reproductive,
immunological, vesicle trafficking, nervous system,
developmental, and neoplastic disorders.

The invention features a substantially purified
polypeptide, comprising the amino acid sequence of SEQ ID
NO:1 or a fragment of SEQ ID NO:1.

The invention further provides a substantially purified
variant having at least 90% amino acid identity to the
sequence of SEQ ID NO:1 or a fragment of SEQ ID NO:1.
The invention also provides an isolated and purified
polynucleotide encoding the polypeptide comprising the amino
acid sequence of SEQ ID NO:1 or a fragment of SEQ ID
NO:1. The invention also includes an isolated and purified
polynucleotide variant having at least 90% polynucleotide
identity to the polynucleotide encoding the polypeptide
consisting of the sequence of SEQ ID NO:1 or a fragment of
SEQ ID NO:1.

Additionally, the invention provides a composition comprising
a polynucleotide encoding the polypeptide comprising
the amino acid sequence of SEQ ID NO:1 or a fragment
of SEQ ID NO:1. The invention further provides an isolated
and purified polynucleotide which hybridizes under stringent
conditions to the polynucleotide encoding the polypeptide
comprising the amino acid sequence of SEQ ID NO:1
or a fragment of SEQ ID NO:1, as well as an isolated and
purified polynucleotide which is complementary to the
polynucleotide encoding the polypeptide consisting of the
sequence of SEQ ID NO:1 or a fragment of SEQ ID NO:1.
The invention also provides an isolated and purified
polynucleotide comprising a sequence of SEQ ID NO:2 or
a fragment of SEQ ID NO:2, and an isolated and purified
polynucleotide variant having at least 90% polynucleotide
identity to the polynucleotide comprising the sequence of
SEQ ID NO:2 or a fragment of SEQ ID NO:2. The invention
also provides an isolated and purified polynucleotide which
is complementary to the polynucleotide comprising the
sequence of SEQ ID NO:2 or a fragment of SEQ ID NO:2.

The invention further provides an expression vector containing
at least a fragment of the polynucleotide encoding
the polypeptide comprising the amino acid sequence of SEQ
ID NO:1 or a fragment of SEQ ID NO:1. In another aspect,
the expression vector is contained within a host cell.

The invention also provides a method for producing a
polypeptide comprising the amino acid sequence of SEQ ID
NO:1 or a fragment of SEQ ID NO:1, the method comprising
the steps of: (a) culturing the host cell containing an
expression vector containing at least a fragment of a
polynucleotide encoding the polypeptide under conditions suitable
for the expression of the polypeptide; and (b) recovering
the polypeptide from the host cell culture.

The invention also provides a pharmaceutical composition
consisting of a substantially purified polypeptide comprising
the amino acid sequence of SEQ ID NO:1 or a
fragment of SEQ ID NO:1 in conjunction with a suitable
pharmaceutical carrier.

The invention further includes a purified antibody which
binds to a polypeptide consisting of the sequence of SEQ ID
NO:1 or a fragment of SEQ ID NO:1, as well as a purified
agonist and a purified antagonist of the polypeptide.

The invention also provides a method for treating or
preventing a reproductive disorder, the method comprising
administering to a subject in need of such treatment an
effective amount of a pharmaceutical composition comprising
substantially purified polypeptide comprising the amino
acid sequence of SEQ ID NO:1 or a fragment of SEQ ID
NO:1.

The invention also provides a method for treating or
preventing an immunological disorder, the method comprising
administering to a subject in need of such treatment an
effective amount of a pharmaceutical composition comprising
substantially purified polypeptide comprising the amino
acid sequence of SEQ ID NO:1 or a fragment of SEQ ID NO:1.

The invention also provides a method for treating or
preventing a vesicle trafficking disorder, the method comprising
administering to a subject in need of such treatment
an effective amount of a pharmaceutical composition comprising
substantially purified polypeptide comprising the
amino acid sequence of SEQ ID NO:1 or a fragment of SEQ
ID NO:1.

The invention also provides a method for treating or
preventing a nervous system disorder, the method comprising
administering to a subject in need of such treatment an
effective amount of a pharmaceutical composition comprising
substantially purified polypeptide comprising the amino
acid sequence of SEQ ID NO:1 or a fragment of SEQ ID
NO:1.

The invention also provides a method for treating or
preventing a developmental disorder, the method comprising
administering to a subject in need of such treatment an
effective amount of a pharmaceutical composition comprising
substantially purified polypeptide comprising the amino
acid sequence of SEQ ID NO:1 or a fragment of SEQ ID
NO:1.

The invention also provides a method for treating or
preventing a neoplastic disorder, the method comprising
administering to a subject in need of such treatment an
effective amount of an antagonist of the polypeptide comprising
the amino acid sequence of SEQ ID NO:1 or a
fragment of SEQ ID NO:1.

The invention also provides a method for detecting a
polynucleotide encoding a polypeptide comprising the
amino acid sequence of SEQ ID NO:1 or a fragment of SEQ
ID NO:1 in a biological sample containing nucleic acids, the
method comprising the steps of: (a) hybridizing the complement
of the polynucleotide encoding the polypeptide comprising
the amino acid sequence of SEQ ID NO:1 or a
fragment of SEQ ID NO:1 to at least one of the nucleic acids
of the biological sample, thereby forming a hybridization
complex; and (b) detecting the hybridization complex,
wherein the presence of the hybridization complex correlates
with the presence of a polynucleotide encoding the
polypeptide in the biological sample. In one aspect, the
nucleic acids of the biological sample are amplified by the
polymerase chain reaction prior to the hybridizing step.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A, 1B, 1C, 1D, and 1E show the amino acid
sequence (SEQ ID NO:1) and nucleic acid sequence (SEQ
ID NO:2) of EVL1. The alignment was produced using
MACDNASIS PRO software (Hitachi Software Engineering).

FIGS. 2A, 2B, and 2C show the amino acid sequence
alignments among EVL1 (3089412; SEQ ID NO:1), mouse
ena/VASP-like protein (GI 1644453; SEQ ID NO:3), and
human VASP (GI 624964; SEQ ID NO:4), produced using
the multisequence alignment program of DNASTAR software
(DNASTAR Inc., Madison, Wis.).

FIGS. 3A and 3B show the hydrophobicity plots for EVL1
(SEQ ID NO:1) and mouse ena/VASP-like protein (SEQ ID
NO:3), respectively; the positive X axis reflects amino acid
position, and the negative Y axis, hydrophobicity
(MACDNASIS PRO software).

DESCRIPTION OF THE INVENTION

Before the present proteins, nucleotide sequences, and
methods are described, it is understood that this invention is
not limited to the particular methodology, protocols, cell
lines, vectors, and reagents described, as these may vary. It
is also to be understood that the terminology used herein is
for the purpose of describing particular embodiments only,
and is not intended to limit the scope of the present invention
which will be limited only by the appended claims.

It must be noted that as used herein and in the appended
claims, the singular forms "a," "an," and "the" include plural
reference unless the context clearly dictates otherwise. Thus,
for example, a reference to "a host cell" includes a plurality
of such host cells, and a reference to "an antibody" is a
reference to one or more antibodies and equivalents thereof
known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific
terms used herein have the same meanings as commonly
understood by one of ordinary skill in the art to which this
invention belongs. Although any methods and materials
similar or equivalent to those described herein can be used
in the practice or testing of the present invention, the
preferred methods, devices, and materials are now
described. All publications mentioned herein are cited for
the purpose of describing and disclosing the cell lines,
vectors, and methodologies which are reported in the
publications and which might be used in connection with the
invention. Nothing herein is to be construed as an admission
that the invention is not entitled to antedate such disclosure
by virtue of prior invention.
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DEFINITIONS (PCR) technologies well known in the art. (See, e.g., 

"EVLl." as used herein, refers to the amino acid Dieffenbach. C W. and G S^ Dveksler (1995) PC« P^^^^^^^^ 

sequences of substantially purified EVLl obtained from any Laboratory Manual, Cold Sprmg Harbor Press. Hainview, 

Species, particularly a mammalian species, including bovine, ' 

ovine, porcine, murine, equine, and preferably the human ^ The term "antagonist," as it is used herein, refers to a 

species, from any source, whether natural, synthetic, semi- molecule which, when bound to EVLl, decreases the 

synthetic or recombinant amount or the duration of the effect of the biological or 

' llte term "agonist," as used herein, refers to a molecule immunological activity of EVLL Antagonist may include 

which, when bound to EVLl, increases or prolongs the protetns, nucleic acids, carbohydrates^ ""^"^t^ °' 

duration of the efect of EVLl. AgonisLs may include '° other molecules which decrease the effect of EVLl. 

proteins, nucleic acids, carbohydrates, or any other mol- As used herein, the term "antibody" refers to intact 

eailes which bind to and modulate the effect of EVLl. molecules as well as to fragments thereof, such as Fab, 

An "allele" or an "allelic sequence," as the.se terms are ^(ab')^, and Fv fragments, which are capable of binding the 

u.sed herein, is an alternative form of the gene encoding epitopic determinant. Antibodies that bind EVLl polypep- 

EVLl. AUeles may result from at least one mutation in the "^es can be prepared using intact polypeptides or using 

nucleic acid sequence and may result in altered mRNAs or fragments contaming small peptides of interest as the immu- 

in polypeptides whose structure or function may or may not '"Z'ng an'-S^n. The polypeptide or oligopeptide used to 

be altered. Any given natural or recombinant gene may have immunize an annual (e.g., a mouse, a rat, or a rabbit) can be 

none, one, or many allelic forms. Common mutational ,„ '^^'^^'l fr"m the translation of RNA, or synthesized 

changes which give rise to alleles are generaUy ascribed to chemically, and can be a,njugated to a carrier protein if 

natural deletions, additions, or substitutions Of nucleotides. ^lesired. Commonly used carriers that are chemically 

Each of these types of changes may occur alone, or in coupled to peptides include bovine serum albumin, 

combination with the others, one or more times in a given thyroglobulin, and keyhole limpet hemocyanin (KLH). The 

sequence «; coupled peptide is then used to immunize the animal. 

"Altered" nucleic acid sequences encoding EVLl, as The term "antigenic determinant," as used herein, refers 
described herein, include those sequences with deletions, to t^^a* fragment of a molecule (i.e., an epitope) that makes 
insertions, or substitutions of different nucleotides, resulting contact with a particular antibody. When a protein or a 
in a polynucleotide the same EVLl or a polypeptide with at fragment of a protcm is used to immumzc a host animal, 
least one functional characteristic of EVLl. Included within 30 numerous regions of the protein may induce the production 
this definition are polymorphisms which may or may not be ^f antibodies which bind specifically to antigenic determi- 
readily delectable using a particular oligonucleotide probe of "^nts (given regions or three-dimensional structures on the 
the polynucleotide encoding EVLl, and improper or unex- protein). An antigenic determinant may compete with the 
pected hybridization to alleles, with a locus other than the intact antigen (i.e., the immunogen used to elicit the immune 
normal chromosomal locus for the polynucleotide sequence 35 response) for binding to an antibody, 
encoding EVLl. The encoded protein may also be "ahered," The term "antisense,*' as used herein, refers to any com- 
and may contain deletions, insertions, or substitutions of position containing a nucleic acid sequence which is 
amino acid residues which produce a silent change and complementary to a specific nucleic acid sequence. The term 
result in a functionally equivalent EVLl. Deliberate amino "antisense strand" is used in reference to a nucleic acid 
acid substitutions may be made 00 the basis of similarity in 40 strand that is complementary to the "sense" strand. Anti- 
polarity, charge, solubility, hydrophobicity, hydrophilicity, sense molecules may be produced by any method including 
and/or the amphipathic nature of the residues, as long as the synthesis or transcription. Once introduced into a cell, the 
biological or immunological activity of EVLl is retained. complementary nucleotides combine with natural sequences 
For example, negatively charged amino acids may include produced by the cell to form duplexes and to block either 
aspartic acid and glutamic acid, positively charged amino 45 transcription or translation. The designation "negative" can 
acids may include lysine and arginine, and amino acids with refer to the antisense strand, and the designation "positive" 
uncharged polar head groups having similar hydrophilicity can refer to the sense strand. 

values may include leucine, isoleucine, and valine; glycine As used herein, the-term "biologically active," refers to a 

and alanine; asparagint and glutamine; serine and threonine; protein having structural, regulatory, or biochemical func- 

and phenylalanine and tyrosine. 50 tions of a naturally occurring molecule. Likewise, "immu- 

The terms "amino acid" or "amino acid sequence," as nologically active" refers to the capability of the natural, 

used herein, refer to an oligopeptide, peptide, polypeptide, recombinant, or synthetic EVLl, or of any oligopeptide 

or protein sequence, or a fragment of any of these, and to thereof, to induce a specific immune response in appropriate 

naturally occurring or synthetic molecules. In this context, animals or cells and to bind with specific antibodies, 

"fragments", "immunogenic fragments", or "antigenic frag- 55 The terms "complementary" or "complementarity," as 

ments" refer to fragments of EVLl which are preferably used herein, refer to the natural binding of polynucleotides 

about 5 to about 15 amino acids in length and which retain under permissive salt and temperature conditions by base 

some biological activity or immunological activity of EVLl. pairing. For example, the sequence "A — G — T" binds to the 

Where "amino acid sequence" is recited herein to refer to an complementary sequence "T — C — A."Complementarity 

amino acid sequence of a naturally occurring protein so between two single-stranded molecules may be "partial," 

molecule, "amino acid sequence" and like terms are not such that only some of the nucleic acids bind, or it may be 

meant to limit the amino acid sequence to the complete "complete," such that total complementarity exists between 

native amino acid sequence associated with the recited the single stranded molecules. The degree of complemen- 

protein molecule. tarity between nucleic acid strands has significant effects on 

"Amplification," as used herein, relates to the production 65 the efficiency and strength of the hybridization between the 

of additional copies of a nucleic acid sequence. Amplifica- nucleic acid strands. This is of particular importance in 

tion is generally carried out using polymerase chain reaction amplification reactions, which depend upon binding 
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between nucleic acids strands, and in the design and use of 
peptide nucleic acid (PNA) molecules. 
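
As a rough illustration of the "partial" versus "complete" distinction drawn in the definition of complementarity, the following Python sketch compares two equal-length DNA strands position by position, in the same way the "A-G-T" / "T-C-A" example is written. The function names (`paired_fraction`, `classify_complementarity`) are hypothetical and are not part of the specification; the sketch assumes plain DNA bases (A, C, G, T) and ignores strand polarity, salt, and temperature.

```python
# Watson-Crick pairing for DNA bases (assumption: no RNA or modified bases).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_fraction(strand_a: str, strand_b: str) -> float:
    """Fraction of positions at which the two strands form a base pair."""
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(a, b) if PAIRS.get(x) == y)
    return paired / len(a)


def classify_complementarity(strand_a: str, strand_b: str) -> str:
    """Label a strand pair as 'complete', 'partial', or 'none'."""
    fraction = paired_fraction(strand_a, strand_b)
    if fraction == 1.0:
        return "complete"
    return "partial" if fraction > 0 else "none"


print(classify_complementarity("AGT", "TCA"))
print(classify_complementarity("AGT", "TCC"))
```

The first call reproduces the example from the text, in which every position pairs; the second differs at one position and is therefore only partially complementary.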

A "composition comprising a given polynucleotide 
sequence" or a "composition consisting of a given amino 
acid sequence," as these terms are used herein, refer broadly 
to any composition containing the given polynucleotide or 
amino acid sequence. The composition may comprise a dry 
formulation, an aqueous solution, or a sterile composition. 
Compositions comprising polynucleotide sequences encod-
ing EVL1 or fragments of EVL1 may be employed as
hybridization probes. The probes may be stored in freeze-
dried form and may be associated with a stabilizing agent
such as a carbohydrate. In hybridizations, the probe may be 
deployed in an aqueous solution containing salts (e.g., 
NaCl), detergents (e.g., SDS), and other components (e.g., 
Denhardt's solution, dry milk, salmon sperm DNA, etc.). 

The phrase "consensus sequence," as used herein, refers
to a nucleic acid sequence which has been resequenced to
resolve uncalled bases, extended using XL-PCR (Perkin
Elmer, Norwalk, Conn.) in the 5' and/or the 3' direction, and
resequenced, or which has been assembled from the over-
lapping sequences of more than one Incyte Clone using a
computer program for fragment assembly, such as the GEL-
VIEW Fragment Assembly system (GCG, Madison, Wis.).
Some sequences have been both extended and assembled to 
produce the consensus sequence. 

As used herein, the term "correlates with expression of a 
polynucleotide" indicates that the detection of the presence 
of nucleic acids, the same or related to a nucleic acid 
sequence encoding EVL1, by northern analysis is indicative
of the presence of nucleic acids encoding EVL1 in a sample,
and thereby correlates with expression of the transcript from
the polynucleotide encoding EVL1.

A "deletion," as the term is used herein, refers to a change 
in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative," as used herein, refers to the chemi-
cal modification of EVL1, of a polynucleotide sequence
encoding EVL1, or of a polynucleotide sequence comple-
mentary to a polynucleotide sequence encoding EVL1.
Chemical modifications of a polynucleotide sequence can
include, for example, replacement of hydrogen by an alkyl,
acyl, or amino group. A derivative polynucleotide encodes a
polypeptide which retains at least one biological or immu-
nological function of the natural molecule. A derivative
polypeptide is one modified by glycosylation, pegylation, or 
any similar process that retains at least one biological or 
immunological function of the polypeptide from which it 
was derived. 

The term "homology," as used herein, refers to a degree 
of complementarity. There may be partial homology or 
complete homology. The word "identity" may substitute for 
the word "homology." A partially complementary sequence 
that at least partially inhibits an identical sequence from 
hybridizing to a target nucleic acid is referred to as "sub- 
stantially homologous." The inhibition of hybridization of 
the completely complementary sequence to the target 
sequence may be examined using a hybridization assay 
(Southern or northern blot, solution hybridization, and the 
like) under conditions of reduced stringency. A substantially 
homologous sequence or hybridization probe will compete 
for and inhibit the binding of a completely homologous 
sequence to the target sequence under conditions of reduced 
stringency. This is not to say that conditions of reduced 
stringency are such that non-specific binding is permitted, as 
reduced stringency conditions require that the binding of
two sequences to one another be a specific (i.e., a selective)
interaction. The absence of non-specific binding may be
tested by the use of a second target sequence which lacks
even a partial degree of complementarity (e.g., less than
about 30% homology or identity). In the absence of non-
specific binding, the substantially homologous sequence or
probe will not hybridize to the second non-complementary 
target sequence. 

The phrases "percent identity" or "% identity" refer to the
percentage of sequence similarity found in a comparison of
two or more amino acid or nucleic acid sequences. Percent
identity can be determined electronically, e.g., by using the
MEGALIGN program (Lasergene software package,
DNASTAR, Inc., Madison, Wis.). The MEGALIGN program
can create alignments between two or more sequences
according to different methods, e.g., the clustal method.
(Higgins, D. G. and P. M. Sharp (1988) Gene 73:237-244.)
The clustal algorithm groups sequences into clusters by
examining the distances between all pairs. The clusters are
aligned pairwise and then in groups. The percentage
similarity between two amino acid sequences, e.g., sequence A
and sequence B, is calculated by dividing the length of
sequence A, minus the number of gap residues in sequence
A, minus the number of gap residues in sequence B, into the
sum of the residue matches between sequence A and
sequence B, times one hundred. Gaps of low or of no
homology between the two amino acid sequences are not
included in determining percentage similarity. Percent identity
between nucleic acid sequences can also be calculated by the
clustal method, or by other methods known in the art, such
as the Jotun Hein method. (See, e.g., Hein, J. (1990)
Methods in Enzymology 183:626-645.) Identity between
sequences can also be determined by other methods known
in the art, e.g., by varying hybridization conditions.
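
The percentage similarity calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MEGALIGN implementation: the function name `percent_similarity` is hypothetical, the two sequences are assumed to be already aligned to equal length with "-" marking gap residues, and "the length of sequence A" is read as the aligned length including gaps.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """100 * matches / (len(A) - gaps in A - gaps in B) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    gaps_a = aligned_a.count(gap)
    gaps_b = aligned_b.count(gap)
    # A residue match is an identical, non-gap residue at the same position.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    denominator = len(aligned_a) - gaps_a - gaps_b
    if denominator <= 0:
        raise ValueError("no comparable positions remain after removing gaps")
    return 100.0 * matches / denominator


print(percent_similarity("ACDEF", "ACDQF"))
print(percent_similarity("ACD-EFG", "ACDKE-G"))
```

In the second call each sequence carries one gap, so the two gapped positions are excluded from the denominator, which matches the statement that gaps are not included in determining percentage similarity.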

"Human artificial chromosomes" (HACs), as described 
herein, are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size, and which 
contain all of the elements required for stable mitotic 
chromosome segregation and maintenance. (See, e.g., 
Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.)

The term "humanized antibody," as used herein, refers to 
antibody molecules in which the amino acid sequence in the 
non-antigen binding regions has been altered so that the 
antibody more closely resembles a human antibody, and still 
retains its original binding ability. 

"Hybridization," as the term is used herein, refers to any 
process by which a strand of nucleic acid binds with a
complementary strand through base pairing. 

As used herein, the term "hybridization complex" refers
to a complex formed between two nucleic acid
sequences by virtue of the formation of hydrogen bonds
between complementary bases. A hybridization complex
may be formed in solution (e.g., C0t or R0t analysis) or
formed between one nucleic acid sequence present in solu- 
tion and another nucleic acid sequence immobilized on a 
solid support (e.g., paper, membranes, filters, chips, pins or 
glass slides, or any other appropriate substrate to which cells 
or their nucleic acids have been fixed). 

The words "insertion" or "addition," as used herein, refer
to changes in an amino acid or nucleotide sequence resulting 
in the addition of one or more amino acid residues or 
nucleotides, respectively, to the sequence found in the 
naturally occurring molecule. 

"Immune response" can refer to conditions associated
with inflammation, trauma, immune disorders, or infectious
or genetic disease, etc. These conditions can be characterized
by expression of various factors, e.g., cytokines,
chemokines, and other signaling molecules, which may
affect cellular and systemic defense systems.

The term "microarray," as used herein, refers to an array
of distinct polynucleotides or oligonucleotides arrayed on a
substrate, such as paper, nylon or any other type of
membrane, filter, chip, glass slide, or any other suitable solid
support.

The term "modulate," as it appears herein, refers to a
change in the activity of EVL1. For example, modulation
may cause an increase or a decrease in protein activity,
binding characteristics, or any other biological, functional,
or immunological properties of EVL1.

The phrases "nucleic acid" or "nucleic acid sequence," as
used herein, refer to an oligonucleotide, nucleotide,
polynucleotide, or any fragment thereof, to DNA or RNA of
genomic or synthetic origin which may be single-stranded or
double-stranded and may represent the sense or the antisense
strand, to peptide nucleic acid (PNA), or to any DNA-like or
RNA-like material. In this context, "fragments" refers to
those nucleic acid sequences which are greater than about 60
nucleotides in length, and most preferably are at least about
100 nucleotides, at least about 1000 nucleotides, or at least
about 10,000 nucleotides in length.

The terms "operably associated" or "operably linked," as
used herein, refer to functionally related nucleic acid
sequences. A promoter is operably associated or operably
linked with a coding sequence if the promoter controls the
transcription of the encoded polypeptide. While operably
associated or operably linked nucleic acid sequences can be
contiguous and in reading frame, certain genetic elements,
e.g., repressor genes, are not contiguously linked to the
encoded polypeptide but still bind to operator sequences that
control expression of the polypeptide.

The term "oligonucleotide," as used herein, refers to a
nucleic acid sequence of at least about 6 nucleotides to 60
nucleotides, preferably about 15 to 30 nucleotides, and most
preferably about 20 to 25 nucleotides, which can be used in
PCR amplification or in a hybridization assay or microarray.
As used herein, the term "oligonucleotide" is substantially
equivalent to the terms "amplimers," "primers,"
"oligomers," and "probes," as these terms are commonly
defined in the art.

"Peptide nucleic acid" (PNA), as used herein, refers to an
antisense molecule or anti-gene agent which comprises an
oligonucleotide of at least about 5 nucleotides in length
linked to a peptide backbone of amino acid residues ending
in lysine. The terminal lysine confers solubility to the
composition. PNAs preferentially bind complementary
single stranded DNA and RNA and stop transcript
elongation, and may be pegylated to extend their lifespan in
the cell. (See, e.g., Nielsen, P. E. et al. (1993) Anticancer
Drug Des. 8:53-63.)

The term "sample," as used herein, is used in its broadest
sense. A biological sample suspected of containing nucleic
acids encoding EVL1, or fragments thereof, or EVL1 itself
may comprise a bodily fluid; an extract from a cell,
chromosome, organelle, or membrane isolated from a cell; a
cell; genomic DNA, RNA, or cDNA, in solution or bound to
a solid support; a tissue; a tissue print; etc.

As used herein, the terms "specific binding" or
"specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, or an
antagonist. The interaction is dependent upon the presence of
a particular structure of the protein recognized by the binding
molecule (i.e., the antigenic determinant or epitope). For
example, if an antibody is specific for epitope "A," the
presence of a polypeptide containing the epitope A, or the
presence of free unlabeled A, in a reaction containing free
labeled A and the antibody will reduce the amount of labeled
A that binds to the antibody.

As used herein, the term "stringent conditions" refers to
conditions which permit hybridization between
polynucleotide sequences and the claimed polynucleotide
sequences. Suitably stringent conditions can be defined by,
for example, the concentrations of salt or formamide in the
prehybridization and hybridization solutions, or by the
hybridization temperature, and are well known in the art. In
particular, stringency can be increased by reducing the
concentration of salt, increasing the concentration of
formamide, or raising the hybridization temperature.

For example, hybridization under high stringency
conditions could occur in about 50% formamide at about
37° C. to 42° C. Hybridization could occur under reduced
stringency conditions in about 35% to 25% formamide at about
30° C. to 35° C. In particular, hybridization could occur
under high stringency conditions at 42° C. in 50%
formamide, 5xSSPE, 0.3% SDS, and 200 µg/ml sheared and
denatured salmon sperm DNA. Hybridization could occur
under reduced stringency conditions as described above, but
in 35% formamide at a reduced temperature of 35° C. The
temperature range corresponding to a particular level of
stringency can be further narrowed by calculating the purine
to pyrimidine ratio of the nucleic acid of interest and
adjusting the temperature accordingly. Variations on the
above ranges and conditions are well known in the art.

The term "substantially purified," as used herein, refers to
nucleic acid or amino acid sequences that are removed from
their natural environment and are isolated or separated, and
are at least about 60% free, preferably about 75% free, and
most preferably about 90% free from other components with
which they are naturally associated.

A "substitution," as used herein, refers to the replacement
of one or more amino acids or nucleotides by different amino
acids or nucleotides, respectively.

"Transformation," as defined herein, describes a process
by which exogenous DNA enters and changes a recipient
cell. Transformation may occur under natural or artificial
conditions according to various methods well known in the
art, and may rely on any known method for the insertion of
foreign nucleic acid sequences into a prokaryotic or
eukaryotic host cell. The method for transformation is selected
based on the type of host cell being transformed and may
include, but is not limited to, viral infection, electroporation,
heat shock, lipofection, and particle bombardment. The term
"transformed" cells includes stably transformed cells in
which the inserted DNA is capable of replication either as an
autonomously replicating plasmid or as part of the host
chromosome, and refers to cells which transiently express
the inserted DNA or RNA for limited periods of time.

A "variant" of EVL1, as used herein, refers to an amino
acid sequence that is altered by one or more amino acids.
The variant may have "conservative" changes, wherein a
substituted amino acid has similar structural or chemical
properties (e.g., replacement of leucine with isoleucine).
More rarely, a variant may have "nonconservative" changes
(e.g., replacement of glycine with tryptophan). Analogous
minor variations may also include amino acid deletions or
insertions, or both. Guidance in determining which amino
acid residues may be substituted, inserted, or deleted without
abolishing biological or immunological activity may be
found using computer programs well known in the art, for
example, DNASTAR software.

The Invention

The invention is based on the discovery of a new human
ena/VASP-like protein splice variant (EVL1), the
polynucleotides encoding EVL1, and the use of these
compositions for the diagnosis, treatment, or prevention of
reproductive, immunological, vesicle trafficking, nervous
system, developmental, and neoplastic disorders.

Nucleic acids encoding the EVL1 of the present invention
were first identified in Incyte Clone 3089412 from the aorta
cDNA library (HEAONOT03) using a computer search for
amino acid sequence alignments. A consensus sequence,
SEQ ID NO:2, was derived from the following overlapping
and/or extended nucleic acid sequences: Incyte Clones
3089412 (HEAONOT03), 2836864 (TLYMNOT03),
1822064 (GBLATUT01), 1446806 (PLACNOT02),
1556238 (BLADTUT04), 1209813 (BRSTNOT02), and the
shotgun sequence SAEA02787.

In one embodiment, the invention encompasses a
polypeptide consisting of the amino acid sequence of SEQ
ID NO:1, as shown in FIGS. 1A, 1B, 1C, 1D, and 1E. EVL1
is 418 amino acids in length and has two potential
N-glycosylation sites at residues N64 and N319; one
potential cAMP- and cGMP-dependent protein kinase
phosphorylation site at residue S160; eight potential casein
kinase II phosphorylation sites at residues S96, S132, S215,
S216, S253, S285, S298, and S371; six potential protein
kinase C phosphorylation sites at residues T21, S22, T48,
S123, T252, and S287; and a proline cluster from about
residue P346 to about residue P368 and a predicted
turn-coil-turn three-fold repeat structure from about residue
T345 to about residue D374, indicative of a protein-protein
interacting domain. As shown in FIGS. 2A, 2B, and 2C, EVL1
has chemical and structural homology with mouse
ena/VASP-like protein (G1644453; SEQ ID NO:3), and human
VASP (GI 624964; SEQ ID NO:4). In particular, EVL1 and
mouse ena/VASP-like protein share 92% identity, two
potential N-glycosylation sites, one potential cAMP- and
cGMP-dependent protein kinase phosphorylation site, seven
potential casein kinase II phosphorylation sites, and six
potential protein kinase C phosphorylation sites. In addition,
EVL1 and mouse ena/VASP-like protein have similar
isoelectric points, 9.2 and 8.7, respectively. As illustrated by
FIGS. 3A and 3B, EVL1 and mouse ena/VASP-like protein
have rather similar hydrophobicity plots. The fragment of
SEQ ID NO:2 from about nucleotide 1161 to about nucleotide
1197 is useful for designing oligonucleotides or to be used
directly as a hybridization probe. Northern analysis shows the
expression of this sequence in various libraries, at least 49%
of which are immortalized or cancerous and at least 26% of
which involve immune response. Of particular note is the
expression of EVL1 in gastrointestinal, cardiovascular,
neural, and developmental tissue; and in prostate, breast,
ovary, and uterus tissue.

The invention also encompasses EVL1 variants. A
preferred EVL1 variant is one which has at least about 80%,
more preferably at least about 90%, and most preferably at
least about 95% amino acid sequence identity to the EVL1
amino acid sequence, and which contains at least one
functional or structural characteristic of EVL1.

The invention also encompasses polynucleotides which
encode EVL1. In a particular embodiment, the invention
encompasses a polynucleotide sequence comprising the
sequence of SEQ ID NO:2, which encodes an EVL1.

The invention also encompasses a variant of a
polynucleotide sequence encoding EVL1. In particular, such a
variant polynucleotide sequence will have at least about 80%,
more preferably at least about 90%, and most preferably at
least about 95% polynucleotide sequence identity to the
polynucleotide sequence encoding EVL1. A particular aspect
of the invention encompasses a variant of SEQ ID NO:2 which
has at least about 80%, more preferably at least about 90%,
and most preferably at least about 95% polynucleotide
sequence identity to SEQ ID NO:2. Any one of the
polynucleotide variants described above can encode an amino
acid sequence which contains at least one functional or
structural characteristic of EVL1.

It will be appreciated by those skilled in the art that as a
result of the degeneracy of the genetic code, a multitude of
polynucleotide sequences encoding EVL1, some bearing
minimal homology to the polynucleotide sequences of any
known and naturally occurring gene, may be produced.
Thus, the invention contemplates each and every possible
variation of polynucleotide sequence that could be made by
selecting combinations based on possible codon choices.
These combinations are made in accordance with the
standard triplet genetic code as applied to the polynucleotide
sequence of naturally occurring EVL1, and all such
variations are to be considered as being specifically disclosed.

Although nucleotide sequences which encode EVL1 and
its variants are preferably capable of hybridizing to the
nucleotide sequence of the naturally occurring EVL1 under
appropriately selected conditions of stringency, it may be
advantageous to produce nucleotide sequences encoding
EVL1 or its derivatives possessing a substantially different
codon usage. Codons may be selected to increase the rate at
which expression of the peptide occurs in a particular
prokaryotic or eukaryotic host in accordance with the
frequency with which particular codons are utilized by the
host. Other reasons for substantially altering the nucleotide
sequence encoding EVL1 and its derivatives without
altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable
properties, such as a greater half-life, than transcripts
produced from the naturally occurring sequence.

The invention also encompasses production of DNA
sequences which encode EVL1 and EVL1 derivatives, or
fragments thereof, entirely by synthetic chemistry. After
production, the synthetic sequence may be inserted into any
of the many available expression vectors and cell systems
using reagents that are well known in the art. Moreover,
synthetic chemistry may be used to introduce mutations into
a sequence encoding EVL1 or any fragment thereof.

Also encompassed by the invention are polynucleotide
sequences that are capable of hybridizing to the claimed
polynucleotide sequences, and, in particular, to those shown
in SEQ ID NO:2, or a fragment of SEQ ID NO:2, under
various conditions of stringency. (See, e.g., Wahl, G. M. and
S. L. Berger (1987) Methods Enzymol. 152:399-407; and
Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.)

Methods for DNA sequencing are well known and
generally available in the art and may be used to practice any
of the embodiments of the invention. The methods may employ
such enzymes as the Klenow fragment of DNA polymerase
I, SEQUENASE (US Biochemical Corp., Cleveland, Ohio),
Taq polymerase (Perkin Elmer), thermostable T7
polymerase (Amersham, Chicago, Ill.), or combinations of
polymerases and proofreading exonucleases such as those
found in the ELONGASE amplification system (GIBCO/BRL,
Gaithersburg, Md.). Preferably, the process is automated
with machines such as the microlab 2200 (Hamilton, Reno,
Nev.), Peltier Thermal-Cycler (PTC200; MJ Research,
Watertown, Mass.) and the ABI catalyst and 373 and 377
DNA sequencers (Perkin Elmer).




The nucleic acid sequences encoding EVL1 may be
extended utilizing a partial nucleotide sequence and employ- 
ing various methods known in the art to detect upstream 
sequences, such as promoters and regulatory elements. For 
example, one method which may be employed, restriction- 
site PCR, uses universal primers to retrieve unknown 
sequence adjacent to a known locus. (See, e.g., Sarkar, G. 
(1993) PCR Methods Applic. 2:318-322.) In particular,
genomic DNA is first amplified in the presence of a primer
complementary to a linker sequence within the vector and a
primer specific to the region predicted to encode the gene. 
The amplified sequences are then subjected to a second 
round of PCR with the same linker primer and another 
specific primer internal to the first one. Products of each 
round of PCR are transcribed with an appropriate RNA 
polymerase and sequenced using reverse transcriptase. 

Inverse PCR may also be used to amplify or extend 
sequences using divergent primers based on a known region. 
(See, e.g., Triglia, T. et al. (1988) Nucleic Acids Res. 
16:8186.) The primers may be designed using commercially 
available software such as OLIGO 4.06 primer analysis 
software (National Biosciences Inc., Plymouth, Minn.) or
another appropriate program to be about 22 to 30 nucle-
otides in length, to have a GC content of about 50% or more,
and to anneal to the target sequence at temperatures of about
68° C. to 72° C. The method uses several restriction
enzymes to generate a suitable fragment in the known region 
of a gene. The fragment is then circularized by intramolecu- 
lar ligation and used as a PCR template. 
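
The length and GC-content guidelines given above for primer design can be checked with a few lines of Python. This is a minimal sketch, not the OLIGO 4.06 software: the function names (`gc_fraction`, `meets_primer_guidelines`) and the example sequences are hypothetical, and the annealing temperature criterion (about 68° C. to 72° C.) is deliberately not modeled because the text does not state how it is computed.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_guidelines(
    seq: str, min_len: int = 22, max_len: int = 30, min_gc: float = 0.50
) -> bool:
    """Check the length (about 22 to 30 nt) and GC content (about 50% or more)."""
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Hypothetical candidate primers, for illustration only.
print(meets_primer_guidelines("GCGCATATGCGCATATGCGCAT"))  # 22 nt, 12 of 22 are G or C
print(meets_primer_guidelines("ATATATATATATATATATATAT"))  # 22 nt, no G or C
```

Because the text says "about" for each threshold, the defaults are exposed as parameters so that they can be loosened or tightened.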

Another method which may be used is capture PCR, 
which involves PCR amplification of DNA fragments adja-
cent to a known sequence in human and yeast artificial
chromosome DNA. (See, e.g., Lagerstrom, M. et al. (1991)
PCR Methods Applic. 1:111-119.) In this method, multiple
restriction enzyme digestions and ligations may be used to
place an engineered double-stranded sequence into an
unknown fragment of the DNA molecule before performing 
PCR. Other methods which may be used to retrieve 
unknown sequences are known in the art. (See, e.g., Parker, 
J. D. et al. (1991) Nucleic Acids Res 19:3055-3060.) 
Additionally, one may use PCR, nested primers, and PRO- 
MOTER FINDER libraries to walk genomic DNA 
(Clontech, Palo Alto, Calif.). This process avoids the need to
screen libraries and is useful in finding intron/exon junc- 
tions. 

When screening for full-length cDNAs, it is preferable to
use libraries that have been size-selected to include larger 
cDNAs. Also, random-primed libraries are preferable in that 
they will include more sequences which contain the 5' 
regions of genes. Use of a randomly primed library may be 
especially preferable for situations in which an oligo d(T) 
library does not yield a full-length cDNA. Genomic libraries 
may be useful for extension of sequence into 5' non-
transcribed regulatory regions.

Capillary electrophoresis systems which are commer- 
cially available may be used to analyze the size or confirm 
the nucleotide sequence of sequencing or PCR products. In 
particular, capillary sequencing may employ flowable poly- 
mers for electrophoretic separation, four different fluores- 
cent dyes (one for each nucleotide) which are laser activated, 
and a charge coupled device camera for detection of the 
emitted wavelengths. Output/light intensity may be con- 
verted to electrical signal using appropriate software (e.g., 
GENOTYPER and SEQUENCE NAVIGATOR, Perkin 
Elmer), and the entire process from loading of samples to 
computer analysis and electronic data display may be
computer controlled. Capillary electrophoresis is especially
preferable for the sequencing of small pieces of DNA which
might be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide
sequences or fragments thereof which encode EVL1 may be
used in recombinant DNA molecules to direct expression of
EVL1, or fragments or functional equivalents thereof, in
appropriate host cells. Due to the inherent degeneracy of the
genetic code, other DNA sequences which encode
substantially the same or a functionally equivalent amino acid
sequence may be produced, and these sequences may be
used to clone and express EVL1.
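
The degeneracy referred to above can be illustrated with a minimal sketch. It assumes the standard genetic code; the peptide `MKLV` and the function names are hypothetical placeholders chosen for illustration and are not taken from the EVL1 sequence or from the specification.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code. Amino acids are listed for codons enumerated in
# TCAG order (first base varies slowest, third base fastest); "*" is a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide):
    """Count the distinct DNA sequences that encode the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    peptide = "MKLV"  # hypothetical placeholder, not an EVL1 sequence
    print(synonymous_codons("L"))
    print(count_encodings(peptide))
```

In this placeholder, M has one codon, K two, L six, and V four, so 48 different DNA sequences encode the same four residues.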

As will be understood by those of skill in the art, it may
be advantageous to produce EVL1-encoding nucleotide
sequences possessing non-naturally occurring codons. For
example, codons preferred by a particular prokaryotic or
eukaryotic host can be selected to increase the rate of protein
expression or to produce an RNA transcript having desirable
properties, such as a half-life which is longer than that of a
transcript generated from the naturally occurring sequence.
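
As a rough sketch of the codon selection described above, the following recodes a coding sequence with host-preferred codons while leaving the encoded amino acids unchanged. The `PREFERRED` table and the input sequence are assumptions made up for illustration; real codon preferences would come from a usage table for the chosen host, which the specification does not provide.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical host preferences, for illustration only.
PREFERRED = {"L": "CTG", "K": "AAA", "V": "GTG"}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Swap each codon for the host-preferred synonym, if one is listed."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard against a preference table that would change the protein.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGTTAAAGGTT"  # hypothetical placeholder sequence
    recoded = recode(original, PREFERRED)
    print(recoded, translate(original) == translate(recoded))
```

The check inside `recode` is the point of the sketch: the nucleotide sequence changes, but the translation must not.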

The nucleotide sequences of the present invention can be
engineered using methods generally known in the art in
order to alter EVL1-encoding sequences for a variety of
reasons including, but not limited to, alterations which
modify the cloning, processing, and/or expression of the
gene product. DNA shuffling by random fragmentation and
PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide
sequences. For example, site-directed mutagenesis may be
used to insert new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants,
introduce mutations, and so forth.

In another embodiment of the invention, natural,
modified, or recombinant nucleic acid sequences encoding
EVL1 may be ligated to a heterologous sequence to encode
a fusion protein. For example, to screen peptide libraries for
inhibitors of EVL1 activity, it may be useful to encode a
chimeric EVL1 protein that can be recognized by a
commercially available antibody. A fusion protein may also be
engineered to contain a cleavage site located between the
EVL1 encoding sequence and the heterologous protein
sequence, so that EVL1 may be cleaved and purified away
from the heterologous moiety.
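
The fusion layout described above can be sketched at the amino acid level. This is only an illustration under stated assumptions: the six-histidine tag and the enterokinase recognition sequence (Asp-Asp-Asp-Asp-Lys, cleaved after the lysine) are example choices, and `MKLV` is a hypothetical placeholder rather than any part of EVL1.

```python
TAG = "HHHHHH"           # illustrative six-histidine heterologous moiety
CLEAVAGE_SITE = "DDDDK"  # enterokinase recognition sequence (assumed choice)


def build_fusion(target):
    """Place the tag and cleavage site ahead of the target protein."""
    return TAG + CLEAVAGE_SITE + target


def cleave(fusion):
    """Split the fusion immediately after the first cleavage site."""
    index = fusion.find(CLEAVAGE_SITE)
    if index < 0:
        raise ValueError("no cleavage site found in fusion protein")
    cut = index + len(CLEAVAGE_SITE)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    target = "MKLV"  # hypothetical placeholder, not an EVL1 sequence
    fusion = build_fusion(target)
    heterologous, released = cleave(fusion)
    print(fusion, heterologous, released)
```

Placing the site directly before the target means that, in this sketch, cleavage releases the target without residues left over from the tag.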

In another embodiment, sequences encoding EVL1 may
be synthesized, in whole or in part, using chemical methods
well known in the art. (See, e.g., Caruthers, M. H. et al.
(1980) Nucl. Acids Res. Symp. Ser. 215-223, and Horn, T.
et al. (1980) Nucl. Acids Res. Symp. Ser. 225-232.)
Alternatively, the protein itself may be produced using
chemical methods to synthesize the amino acid sequence of
EVL1, or a fragment thereof. For example, peptide synthesis
can be performed using various solid-phase techniques.
(See, e.g., Roberge, J. Y. et al. (1995) Science 269:202-204.)
Automated synthesis may be achieved using the ABI 431A
peptide synthesizer (Perkin Elmer).

The newly synthesized peptide may be substantially
purified by preparative high performance liquid
chromatography. (See, e.g., Chiez, R. M. and F. Z. Regnier (1990)
Methods Enzymol. 182:392-421.) The composition of the
synthetic peptides may be confirmed by amino acid analysis
or by sequencing. (See, e.g., Creighton, T. (1983) Proteins,
Structures and Molecular Properties, W H Freeman and Co.,
New York, N.Y.) Additionally, the amino acid sequence of
EVL1, or any part thereof, may be altered during direct
synthesis and/or combined with sequences from other
proteins, or any part thereof, to produce a variant
polypeptide.

In order to express a biologically active EVL1, the
nucleotide sequences encoding EVL1 or derivatives thereof
may be inserted into an appropriate expression vector, i.e., a
vector which contains the necessary elements for the
transcription and translation of the inserted coding sequence.

Methods which are well known to those skilled in the art
may be used to construct expression vectors containing
sequences encoding EVL1 and appropriate transcriptional
and translational control elements. These methods include in
vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination. (See, e.g., Sambrook, J.
et al. (1989) Molecular Cloning, A Laboratory Manual,
Cold Spring Harbor Press, Plainview, N.Y., ch. 4, 8, and
16-17; and Ausubel, F. M. et al. (1995, and periodic
supplements) Current Protocols in Molecular Biology, John
Wiley & Sons, New York, N.Y., ch. 9, 13, and 16.)

A variety of expression vector/host systems may be
utilized to contain and express sequences encoding EVL1.
These include, but are not limited to, microorganisms such
as bacteria transformed with recombinant bacteriophage,
plasmid, or cosmid DNA expression vectors; yeast
transformed with yeast expression vectors; insect cell systems
infected with virus expression vectors (e.g., baculovirus);
plant cell systems transformed with virus expression vectors
(e.g., cauliflower mosaic virus (CaMV) or tobacco mosaic
virus (TMV)) or with bacterial expression vectors (e.g., Ti or
pBR322 plasmids); or animal cell systems. The invention is
not limited by the host cell employed.

The "control elements" or "regulatory sequences" are
those non-translated regions, e.g., enhancers, promoters, and
5' and 3' untranslated regions, of the vector and
polynucleotide sequences encoding EVL1 which interact with host
cellular proteins to carry out transcription and translation.
Such elements may vary in their strength and specificity.
Depending on the vector system and host utilized, any
number of suitable transcription and translation elements,
including constitutive and inducible promoters, may be
used. For example, when cloning in bacterial systems,
inducible promoters, e.g., hybrid lacZ promoter of the
BLUESCRIPT phagemid (Stratagene, La Jolla, Calif.) or
PSPORT1 plasmid (GIBCO/BRL), may be used. The
baculovirus polyhedrin promoter may be used in insect cells.
Promoters or enhancers derived from the genomes of plant
cells (e.g., heat shock, RUBISCO, and storage protein
genes) or from plant viruses (e.g., viral promoters or leader
sequences) may be cloned into the vector. In mammalian cell
systems, promoters from mammalian genes or from
mammalian viruses are preferable. If it is necessary to generate
a cell line that contains multiple copies of the sequence
encoding EVL1, vectors based on SV40 or EBV may be
used with an appropriate selectable marker.

In bacterial systems, a number of expression vectors may
be selected depending upon the use intended for EVL1. For
example, when large quantities of EVL1 are needed for the
induction of antibodies, vectors which direct high level
expression of fusion proteins that are readily purified may be
used. Such vectors include, but are not limited to,
multifunctional E. coli cloning and expression vectors such as
BLUESCRIPT (Stratagene), in which the sequence
encoding EVL1 may be ligated into the vector in frame with
sequences for the amino-terminal Met and the subsequent 7
residues of β-galactosidase so that a hybrid protein is
produced, and pIN vectors. (See, e.g., Van Heeke, G. and S.
M. Schuster (1989) J. Biol. Chem. 264:5503-5509.) pGEX
vectors (Pharmacia Biotech, Uppsala, Sweden) may also be
used to express foreign polypeptides as fusion proteins with
glutathione S-transferase (GST). In general, such fusion
proteins are soluble and can easily be purified from lysed
cells by adsorption to glutathione-agarose beads followed by
elution in the presence of free glutathione. Proteins made in
such systems may be designed to include heparin, thrombin,
or factor XA protease cleavage sites so that the cloned
polypeptide of interest can be released from the GST moiety
at will.

In the yeast Saccharomyces cerevisiae, a number of
vectors containing constitutive or inducible promoters, such
as alpha factor, alcohol oxidase, and PGH, may be used.
(See, e.g., Ausubel, supra; and Grant et al. (1987) Methods
Enzymol. 153:516-544.) In cases where plant expression
vectors are used, the expression of sequences encoding
EVL1 may be driven by any of a number of promoters. For
example, viral promoters such as the 35S and 19S promoters
of CaMV may be used alone or in combination with the
omega leader sequence from TMV (Takamatsu, N. (1987)
EMBO J. 6:307-311.) Alternatively, plant promoters such as
the small subunit of RUBISCO or heat shock promoters may
be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J.
3:1671-1680; Broglie, R. et al. (1984) Science
224:838-843; and Winter, J. et al. (1991) Results Probl. Cell
Differ. 17:85-105.) These constructs can be introduced into
plant cells by direct DNA transformation or
pathogen-mediated transfection. Such techniques are described in a
number of generally available reviews. (See, e.g., Hobbs, S.
or Murry, L. E. in McGraw Hill Yearbook of Science and
Technology (1992) McGraw Hill, New York, N.Y.; pp.
191-196.) An insect system may also be used to express
EVL1. For example, in one such system, Autographa
californica nuclear polyhedrosis virus (AcNPV) is used as a
vector to express foreign genes in Spodoptera frugiperda
cells or in Trichoplusia larvae. The sequences encoding
EVL1 may be cloned into a non-essential region of the virus,
such as the polyhedrin gene, and placed under control of the
polyhedrin promoter. Successful insertion of sequences
encoding EVL1 will render the polyhedrin gene inactive and
produce recombinant virus lacking coat protein. The
recombinant viruses may then be used to infect, for example, S.
frugiperda cells or Trichoplusia larvae in which EVL1 may
be expressed. (See, e.g., Engelhard, E. K. et al. (1994) Proc.
Natl. Acad. Sci. 91:3224-3227.)

In mammalian host cells, a number of viral-based
expression systems may be utilized. In cases where an adenovirus
is used as an expression vector, sequences encoding EVL1
may be ligated into an adenovirus transcription/translation
complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the
viral genome may be used to obtain a viable virus which is
capable of expressing EVL1 in infected host cells. (See, e.g.,
Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci.
81:3655-3659.) In addition, transcription enhancers, such as
the Rous sarcoma virus (RSV) enhancer, may be used to
increase expression in mammalian host cells.

Human artificial chromosomes (HACs) may also be
employed to deliver larger fragments of DNA than can be
contained and expressed in a plasmid. HACs of about 6 kb
to 10 Mb are constructed and delivered via conventional
delivery methods (liposomes, polycationic amino polymers,
or vesicles) for therapeutic purposes.

Specific initiation signals may also be used to achieve
more efficient translation of sequences encoding EVL1.
Such signals include the ATG initiation codon and adjacent
sequences. In cases where sequences encoding EVL1 and its
initiation codon and upstream sequences are inserted into the
appropriate expression vector, no additional transcriptional
or translational control signals may be needed. However, in
cases where only coding sequence, or a fragment thereof, is
inserted, exogenous translational control signals including
the ATG initiation codon should be provided. Furthermore,
the initiation codon should be in the correct reading frame to
ensure translation of the entire insert. Exogenous
translational elements and initiation codons may be of various
origins, both natural and synthetic. The efficiency of
expression may be enhanced by the inclusion of enhancers
appropriate for the particular cell system used. (See, e.g., Scharf,
D. et al. (1994) Results Probl. Cell Differ. 20:125-162.)

In addition, a host cell strain may be chosen for its ability
to modulate expression of the inserted sequences or to
process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited
to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation, and acylation. Post-translational
processing which cleaves a "prepro" form of the protein may
also be used to facilitate correct insertion, folding, and/or
function. Different host cells which have specific cellular
machinery and characteristic mechanisms for
post-translational activities (e.g., CHO, HeLa, MDCK, HEK293,
and WI38) are available from the American Type Culture
Collection (ATCC, Bethesda, Md.) and may be chosen to
ensure the correct modification and processing of the foreign
protein.

For long term, high yield production of recombinant
proteins, stable expression is preferred. For example, cell
lines capable of stably expressing EVL1 can be transformed
using expression vectors which may contain viral origins of
replication and/or endogenous expression elements and a
selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be
allowed to grow for about 1 to 2 days in enriched media
before being switched to selective media. The purpose of the
selectable marker is to confer resistance to selection, and its
presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones
of stably transformed cells may be proliferated using tissue
culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover
transformed cell lines. These include, but are not limited to,
the herpes simplex virus thymidine kinase genes and
adenine phosphoribosyltransferase genes, which can be
employed in tk⁻ or apr⁻ cells, respectively. (See, e.g., Wigler,
M. et al. (1977) Cell 11:223-232; and Lowy, I. et al. (1980)
Cell 22:817-23.) Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For
example, dhfr confers resistance to methotrexate; npt
confers resistance to the aminoglycosides neomycin and G-418;
and als or pat confer resistance to chlorsulfuron and
phosphinotricin acetyltransferase, respectively. (See, e.g.,
Wigler, M. et al. (1980) Proc. Natl. Acad. Sci.
77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14; and Murry, supra.) Additional selectable genes
have been described, e.g., trpB, which allows cells to utilize
indole in place of tryptophan, or hisD, which allows cells to
utilize histidinol in place of histidine. (See, e.g., Hartman, S.
C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci.
85:8047-8051.) Recently, the use of visible markers has
gained popularity with such markers as anthocyanins, β
glucuronidase and its substrate GUS, luciferase and its
substrate luciferin. Green fluorescent proteins (GFP)
(Clontech, Palo Alto, Calif.) are also used. (See, e.g., Chalfie,
M. et al. (1994) Science 263:802-805.) These markers can
be used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression
attributable to a specific vector system. (See, e.g., Rhodes,
C. A. et al. (1995) Methods Mol. Biol. 55:121-131.)

Although the presence/absence of marker gene expression
suggests that the gene of interest is also present, the presence
and expression of the gene may need to be confirmed. For
example, if the sequence encoding EVL1 is inserted within
a marker gene sequence, transformed cells containing
sequences encoding EVL1 can be identified by the absence
of marker gene function. Alternatively, a marker gene can be
placed in tandem with a sequence encoding EVL1 under the
control of a single promoter. Expression of the marker gene
in response to induction or selection usually indicates
expression of the tandem gene as well.

Alternatively, host cells which contain the nucleic acid
sequence encoding EVL1 and express EVL1 may be
identified by a variety of procedures known to those of skill in
the art. These procedures include, but are not limited to,
DNA-DNA or DNA-RNA hybridizations and protein
bioassay or immunoassay techniques which include membrane,
solution, or chip based technologies for the detection and/or
quantification of nucleic acid or protein sequences.

The presence of polynucleotide sequences encoding
EVL1 can be detected by DNA-DNA or DNA-RNA
hybridization or amplification using probes or fragments or
fragments of polynucleotides encoding EVL1. Nucleic acid
amplification based assays involve the use of
oligonucleotides or oligomers based on the sequences encoding EVL1
to detect transformants containing DNA or RNA encoding
EVL1.

A variety of protocols for detecting and measuring the
expression of EVL1, using either polyclonal or monoclonal
antibodies specific for the protein, are known in the art.
Examples of such techniques include enzyme-linked
immunosorbent assays (ELISAs), radioimmunoassays (RIAs),
and fluorescence activated cell sorting (FACS). A two-site,
monoclonal-based immunoassay utilizing monoclonal
antibodies reactive to two non-interfering epitopes on EVL1 is
preferred, but a competitive binding assay may be
employed. These and other assays are well described in the
art. (See, e.g., Hampton, R. et al. (1990) Serological
[illegible].)

A wide variety of labels and conjugation techniques are
known by those skilled in the art and may be used in various
nucleic acid and amino acid assays. Means for producing
labeled hybridization or PCR probes for detecting sequences
related to polynucleotides encoding EVL1 include
oligolabeling, nick translation, end-labeling, or PCR
amplification using a labeled nucleotide. Alternatively, the
sequences encoding EVL1, or any fragments thereof, may
be cloned into a vector for the production of an mRNA
probe. Such vectors are known in the art, are commercially
available, and may be used to synthesize RNA probes in
vitro by addition of an appropriate RNA polymerase such as
T7, T3, or SP6 and labeled nucleotides. These procedures
may be conducted using a variety of commercially available
kits, such as those provided by Pharmacia & Upjohn
(Kalamazoo, Mich.), Promega (Madison, Wis.), and U.S.
Biochemical Corp. (Cleveland, Ohio). Suitable reporter
molecules or labels which may be used for ease of detection
include radionuclides, enzymes, fluorescent,
chemiluminescent, or chromogenic agents, as well as
substrates, cofactors, inhibitors, magnetic particles, and the
like.

Host cells transformed with nucleotide sequences
encoding EVL1 may be cultured under conditions suitable for the
expression and recovery of the protein from cell culture. The
protein produced by a transformed cell may be secreted or
contained intracellularly depending on the sequence and/or
the vector used. As will be understood by those of skill in the
art, expression vectors containing polynucleotides which
encode EVL1 may be designed to contain signal sequences
which direct secretion of EVL1 through a prokaryotic or
eukaryotic cell membrane. Other constructions may be used
to join sequences encoding EVL1 to nucleotide sequences
encoding a polypeptide domain which will facilitate
purification of soluble proteins. Such purification facilitating
domains include, but are not limited to, metal chelating
peptides such as histidine-tryptophan modules that allow
purification on immobilized metals, protein A domains that
allow purification on immobilized immunoglobulin, and the
domain utilized in the FLAGS extension/affinity purification
system (Immunex Corp., Seattle, Wash.). The inclusion of
cleavable linker sequences, such as those specific for Factor
XA or enterokinase (Invitrogen, San Diego, Calif.), between
the purification domain and the EVL1 encoding sequence
may be used to facilitate purification. One such expression
vector provides for expression of a fusion protein containing
EVL1 and a nucleic acid encoding 6 histidine residues
preceding a thioredoxin or an enterokinase cleavage site.
The histidine residues facilitate purification on immobilized
metal ion affinity chromatography (IMAC). (See, e.g.,
Porath, J. et al. (1992) Prot. Exp. Purif. 3:263-281.) The
enterokinase cleavage site provides a means for purifying
EVL1 from the fusion protein. (See, e.g., Kroll, D. J. et al.
(1993) DNA Cell Biol. 12:441-453.) Fragments of EVL1
may be produced not only by recombinant production, but
also by direct peptide synthesis using solid-phase
techniques. (See, e.g., Creighton, T. E. (1984) Protein: Structures
and Molecular Properties, pp. 55-60, W. H. Freeman and
Co., New York, N.Y.) Protein synthesis may be performed
by manual techniques or by automation. Automated
synthesis may be achieved, for example, using the Applied
Biosystems 431A peptide synthesizer (Perkin Elmer). Various
fragments of EVL1 may be synthesized separately and then
combined to produce the full length molecule.

Therapeutics 

Chemical and structural homology exists among EVL1,
mouse ena/VASP-like protein (GI 1644453), and human
VASP (GI 624964). In addition, EVL1 is expressed in cancer
and the immune response; in gastrointestinal,
cardiovascular, neural, and developmental tissue; and in
prostate, breast, ovary, and uterus tissue. Therefore, EVL1
appears to play a role in reproductive, immunological,
vesicle trafficking, nervous system, developmental, and
neoplastic disorders.

Therefore, in one embodiment, EVL1 or a fragment or
derivative thereof may be administered to a subject to treat
or prevent a reproductive disorder. Such reproductive
disorders can include, but are not limited to, disorders of
prolactin production; infertility, including tubal disease,
ovulatory defects, and endometriosis; disruptions of the
estrous cycle, disruptions of the menstrual cycle, polycystic
ovary syndrome, ovarian hyperstimulation syndrome,
endometrial and ovarian tumors, autoimmune disorders,
ectopic pregnancy, and teratogenesis; cancer of the breast,
fibrocystic breast disease, and galactorrhea; disruptions of
spermatogenesis, abnormal sperm physiology, cancer of the
testis, cancer of the prostate, benign prostatic hyperplasia,
and prostatitis, carcinoma of the male breast and
gynecomastia.

In another embodiment, a vector capable of expressing
EVL1 or a fragment or derivative thereof may be
administered to a subject to treat or prevent a reproductive disorder
including, but not limited to, those described above.

In a further embodiment, a pharmaceutical composition
comprising a substantially purified EVL1 in conjunction
with a suitable pharmaceutical carrier may be administered
to a subject to treat or prevent a reproductive disorder
including, but not limited to, those provided above.

In still another embodiment, an agonist which modulates
the activity of EVL1 may be administered to a subject to
treat or prevent a reproductive disorder including, but not
limited to, those listed above.

In another embodiment, EVL1 or a fragment or derivative
thereof may be administered to a subject to treat or prevent
an immunological disorder. Such immunological disorders
can include, but are not limited to, AIDS, Addison's disease,
adult respiratory distress syndrome, allergies, ankylosing
spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis,
bronchitis, cholecystitis, contact dermatitis, Crohn's disease,
atopic dermatitis, dermatomyositis, diabetes mellitus,
emphysema, erythema nodosum, atrophic gastritis,
glomerulonephritis, Goodpasture's syndrome, gout, Graves'
disease, Hashimoto's thyroiditis, hypereosinophilia, irritable
bowel syndrome, lupus erythematosus, multiple sclerosis,
myasthenia gravis, myocardial or pericardial inflammation,
osteoarthritis, osteoporosis, pancreatitis, polymyositis,
rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic
anaphylaxis, systemic lupus erythematosus, systemic
sclerosis, ulcerative colitis, Werner syndrome, and
complications of cancer, hemodialysis, and extracorporeal
circulation; viral, bacterial, fungal, parasitic, protozoal, and
helminthic infections; and trauma.

In another embodiment, a vector capable of expressing
EVL1 or a fragment or derivative thereof may be
administered to a subject to treat or prevent an immunological
disorder including, but not limited to, those described above.

In a further embodiment, a pharmaceutical composition
comprising a substantially purified EVL1 in conjunction
with a suitable pharmaceutical carrier may be administered
to a subject to treat or prevent an immunological disorder
including, but not limited to, those provided above.

In still another embodiment, an agonist which modulates
the activity of EVL1 may be administered to a subject to
treat or prevent an immunological disorder including, but
not limited to, those listed above.

In another embodiment, EVL1 or a fragment or derivative
thereof may be administered to a subject to treat or prevent
a vesicle trafficking disorder. Such vesicle trafficking
disorders can include, but are not limited to, cystic fibrosis,
glucose-galactose malabsorption syndrome,
hypercholesterolemia, diabetes mellitus, diabetes insipidus,
hyper- and hypoglycemia, Graves' disease, goiter,
Cushing's disease, and Addison's disease; gastrointestinal
disorders including ulcerative colitis, gastric and duodenal ulcers;
other conditions associated with abnormal vesicle trafficking
including AIDS; allergies including hay fever, asthma, and
urticaria (hives); autoimmune hemolytic anemia;
proliferative glomerulonephritis; inflammatory bowel disease;
multiple sclerosis; myasthenia gravis; rheumatoid and
osteoarthritis; scleroderma; Chediak-Higashi and Sjogren's
syndromes; systemic lupus erythematosus; toxic shock
syndrome; traumatic tissue damage; and viral, bacterial, fungal,
helminth, and protozoal infections.

In another embodiment, a vector capable of expressing
EVL1 or a fragment or derivative thereof may be
administered to a subject to treat or prevent a vesicle trafficking
disorder including, but not limited to, those described above.

In a further embodiment, a pharmaceutical composition
comprising a substantially purified EVL1 in conjunction
with a suitable pharmaceutical carrier may be administered
to a subject to treat or prevent a vesicle trafficking disorder
including, but not limited to, those provided above.

In still another embodiment, an agonist which modulates
the activity of EVL1 may be administered to a subject to
treat or prevent a vesicle trafficking disorder including, but
not limited to, those listed above.

In another embodiment, EVL1 or a fragment or derivative
thereof may be administered to a subject to treat or prevent
a nervous system disorder. Such nervous system disorders
can include, but are not limited to, akathesia, Alzheimer's
disease, amnesia, amyotrophic lateral sclerosis, bipolar
disorder, catatonia, cerebral neoplasms, dementia,
depression, diabetic neuropathy, Down's syndrome, tardive
dyskinesia, dystonias, epilepsy, Huntington's disease,
multiple sclerosis, neurofibromatosis, Parkinson's disease,
paranoid psychoses, postherpetic neuralgia, schizophrenia, and
Tourette's disorder; angina, anaphylactic shock,
arrhythmias, asthma, cardiovascular shock, Cushing's
syndrome, hypertension, hypoglycemia, myocardial
infarction, migraine, and pheochromocytoma.

In another embodiment, a vector capable of expressing
EVL1 or a fragment or derivative thereof may be
administered to a subject to treat or prevent a nervous system
disorder including, but not limited to, those described above.

In a further embodiment, a pharmaceutical composition
comprising a substantially purified EVL1 in conjunction
with a suitable pharmaceutical carrier may be administered
to a subject to treat or prevent a nervous system disorder
including, but not limited to, those provided above.

In still another embodiment, an agonist which modulates
the activity of EVL1 may be administered to a subject to
treat or prevent a nervous system disorder including, but not
limited to, those listed above.

In another embodiment, EVL1 or a fragment or derivative
thereof may be administered to a subject to treat or prevent
a developmental disorder. The term "developmental
disorder" refers to any disorder associated with development or
function of a tissue, organ, or system of a subject (such as
the brain, adrenal gland, kidney, skeletal or reproductive
system). Such developmental disorders can include, but are
not limited to, renal tubular acidosis, anemia, Cushing's
syndrome, achondroplastic dwarfism, Duchenne and Becker
muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR
syndrome, Smith-Magenis syndrome, myelodysplastic
syndrome, hereditary mucoepithelial dysplasia, hereditary
keratodermas, hereditary neuropathies such as
Charcot-Marie-Tooth disease and neurofibromatosis,
hypothyroidism, hydrocephalus, seizure disorders such as
Sydenham's chorea and cerebral palsy, spina bifida, and
congenital glaucoma, cataract, or sensorineural hearing loss.

In another embodiment, a vector capable of expressing
EVL1 or a fragment or derivative thereof may be
administered to a subject to treat or prevent a developmental
disorder including, but not limited to, those described above.

In a further embodiment, a pharmaceutical composition
comprising a substantially purified EVL1 in conjunction
with a suitable pharmaceutical carrier may be administered

In a further embodiment, an antagonist of EVL1 may be
administered to a subject to treat or prevent a neoplastic
disorder. Such a neoplastic disorder may include, but is not
limited to, adenocarcinoma, leukemia, lymphoma,
melanoma, myeloma, sarcoma, teratocarcinoma, and, in
particular, cancers of the adrenal gland, bladder, bone, bone
[illegible] gastrointestinal tract, heart, kidney, liver, lung,
muscle, ovary, parathyroid, penis, prostate, salivary glands,
skin, [illegible] antibody which specifically binds EVL1 may
be used directly as an antagonist or indirectly as a targeting or
delivery mechanism for bringing a pharmaceutical agent to
cells or tissue which express EVL1.

In an additional embodiment, a vector expressing the
complement of the polynucleotide encoding EVL1 may be
administered to a subject to treat or prevent a neoplastic
disorder including, but not limited to, those described above.

In other embodiments, any of the proteins, antagonists,
antibodies, agonists, complementary sequences, or vectors
of the invention may be administered in combination with
other appropriate therapeutic agents. Selection of the
appropriate agents for use in combination therapy may be made by
one of ordinary skill in the art, according to conventional
pharmaceutical principles. The combination of therapeutic
agents may act synergistically to effect the treatment or
prevention of the various disorders described above. Using
this approach, one may be able to achieve therapeutic
efficacy with lower dosages of each agent, thus reducing the
potential for adverse side effects.

An antagonist of EVL1 may be produced using methods
which are generally known in the art. In particular, purified
EVL1 may be used to produce antibodies or to screen
libraries of pharmaceutical agents to identify those which
specifically bind EVL1. Antibodies to EVL1 may also be
generated using methods that are well known in the art. Such
antibodies may include, but are not limited to, polyclonal,
monoclonal, chimeric, and single chain antibodies, Fab
fragments, and fragments produced by a Fab expression
library. Neutralizing antibodies (i.e., those which inhibit
dimer formation) are especially preferred for therapeutic
use.

For the production of antibodies, various hosts including
goats, rabbits, rats, mice, humans, and others may be
immunized by injection with EVL1 or with any fragment or
oligopeptide thereof which has immunogenic properties.
Depending on the host species, various adjuvants may be
used to increase immunological response. Such adjuvants
include, but are not limited to, Freund's, mineral gels such
as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, KLH, and dinitrophenol. Among adjuvants used
in humans, BCG (bacilli Calmette-Guerin) and
Corynebacterium parvum are especially preferable.

It is preferred that the oligopeptides, peptides, or
fragments used to induce antibodies to EVL1 have an amino acid
sequence consisting of at least about 5 amino acids, and,
more preferably, of at least about 10 amino acids. It is also
preferable that these oligopeptides, peptides, or fragments
are identical to a portion of the amino acid sequence of the
natural protein and contain the entire amino acid sequence of

to a subject to treat or prevent a developmental disorder a small, naturally occurring molecule. Short stretches of 

including, but not Hmited to, those provided above. EVLl amino acids may be fused with those of another 

In still another embodiment, an agonist which modulates protein, such as KLH, and antibodies to the chimeric mol- 
the activity of EVLl may be administered to a subject to 65 ecule may be produced. 

treat or prevent a developmental disorder including, but not Monoclonal antibodies to EVLl may be prepared using 

limited to, those listed above. any technique which provides for the production of antibody 
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molecules by continuous cell lines in culture. These include, 
but are not limited to, the hybridoma technique, the human 
B-cell hybridoma technique, and the EBV-hybridoma tech- 
nique. (See, e.g., Kohler, G. et al. (1975) Nature 
256:495497; Kozbor, D. et al. (1985) J. Immunol. Methods 
81:31-42; Cole, R. J. et al. (1983) Proc. Natl. Acad. Sci. 
80:2()26-2{)30; and Cole, S. R et al. (1984) Mol. Cell Biol. 
62:109-120.) 

In addition, techniques developed for the production of
"chimeric antibodies," such as the splicing of mouse antibody
genes to human antibody genes to obtain a molecule
with appropriate antigen specificity and biological activity, 
can be used. (See, e.g., Morrison, S. L. et al. (1984) Proc. 
Natl. Acad. Sci. 81:6851-6855; Neuberger, M. S. et al. 
(1984) Nature 312:604-608; and Takeda, S. et al. (1985) 
Nature 314:452-454.) Alternatively, techniques described 
for the production of single chain antibodies may be 
adapted, using methods known in the art, to produce EVLl- 
specific single chain antibodies. Antibodies with related 
specificity, but of distinct idiotypic composition, may be 
generated by chain shuffling from random combinatorial 
immunoglobulin libraries. (See, e.g., Burton, D. R. (1991)
Proc. Natl. Acad. Sci. 88:10134-10137.) 

Antibodies may also be produced by inducing in vivo 
production in the lymphocyte population or by screening 
immunoglobulin libraries or panels of highly specific binding
reagents as disclosed in the literature. (See, e.g., Orlandi,
R. et al. (1989) Proc. Natl. Acad. Sci. 86:3833-3837; and
Winter, G. et al. (1991) Nature 349:293-299.)

Antibody fragments which contain specific binding sites 
for EVLl may also be generated. For example, such frag- 
ments include, but are not limited to, F(ab')2 fragments 
produced by pepsin digestion of the antibody molecule and 
Fab fragments generated by reducing the disulfide bridges of 
the F(ab')2 fragments. Alternatively, Fab expression libraries 
may be constructed to allow rapid and easy identification of 
monoclonal Fab fragments with the desired specificity. (See, 
e.g., Huse, W. D. et al. (1989) Science 246:1275-1281.) 

Various immunoassays may be used for screening to 
identify antibodies having the desired specificity. Numerous 
protocols for competitive binding or immunoradiometric
assays using either polyclonal or monoclonal antibodies 
with established specificities are well known in the art. Such 
immunoassays typically involve the measurement of com- 
plex formation between EVLl and its specific antibody. A 
two-site, monoclonal-based immunoassay utilizing monoclonal
antibodies reactive to two non-interfering EVLl
epitopes is preferred, but a competitive binding assay may 
also be employed. (Maddox, supra.) 

In another embodiment of the invention, the polynucle- 
otides encoding EVLl, or any fragment or complement 
thereof, may be used for therapeutic purposes. In one aspect, 
the complement of the polynucleotide encoding EVLl may 
be used in situations in which it would be desirable to block 
the transcription of the mRNA. In particular, cells may be 
transformed with sequences complementary to polynucleotides
encoding EVLl. Thus, complementary molecules or
fragments may be used to modulate EVLl activity, or to 
achieve regulation of gene function. Such technology is now 
well known in the art, and sense or antisense oligonucle- 
otides or larger fragments can be designed from various 
locations along the coding or control regions of sequences 
encoding EVLl. 

Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various
bacterial plasmids, may be used for delivery of nucleotide
sequences to the targeted organ, tissue, or cell population.
Methods which are well known to those skilled in the art can
be used to construct vectors which will express nucleic acid
sequences complementary to the polynucleotides of the gene
encoding EVLl. (See, e.g., Sambrook, supra; and Ausubel,
supra.)

Genes encoding EVLl can be turned off by transforming
a cell or tissue with expression vectors which express high 
levels of a polynucleotide, or fragment thereof, encoding 
EVLl. Such constructs may be used to introduce untrans- 
latable sense or antisense sequences into a cell. Even in the 
absence of integration into the DNA, such vectors may 
continue to transcribe RNA molecules until they are disabled 
by endogenous nucleases. Transient expression may last for 
a month or more with a non-replicating vector, and may last 
even longer if appropriate replication elements are part of 
the vector system. 

As mentioned above, modifications of gene expression
can be obtained by designing complementary sequences or
antisense molecules (DNA, RNA, or PNA) to the control, 5',
or regulatory regions of the gene encoding EVLl.
Oligonucleotides derived from the transcription initiation site,
e.g., between about positions -10 and +10 from the start site,
are preferred. Similarly, inhibition can be achieved using
triple helix base-pairing methodology. Triple helix pairing is
useful because it causes inhibition of the ability of the
double helix to open sufficiently for the binding of
polymerases, transcription factors, or regulatory molecules.
Recent therapeutic advances using triplex DNA have been
described in the literature. (See, e.g., Gee, J. E. et al. (1994)
in Huber, B. E. and B. I. Carr, Molecular and Immunologic
Approaches, Futura Publishing Co., Mt. Kisco, N.Y., pp.
163-177.) A complementary sequence or antisense molecule
may also be designed to block translation of mRNA by
preventing the transcript from binding to ribosomes.
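
By way of illustration only, the selection of an oligonucleotide
from the region between about positions -10 and +10 of the start
site may be sketched as follows. The function names, the example
input, and the convention that the start site is position +1 with
no position 0 are assumptions made for this sketch and are not
taken from the EVLl sequence.

```python
# Hypothetical sketch: derive a candidate antisense oligonucleotide from the
# region around a transcription start site. All inputs are illustrative.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_window(sense: str, start_index: int,
                     upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligo covering positions -upstream..+downstream.

    start_index is the 0-based index of position +1 (the start site) in
    `sense`. There is no position 0, so the window spans
    upstream + downstream bases.
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[lo:hi])
```

Such a routine only proposes a candidate; whether a given
oligonucleotide actually inhibits expression must be determined
experimentally.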

Ribozymes, enzymatic RNA molecules, may also be used
to catalyze the specific cleavage of RNA. The mechanism of
ribozyme action involves sequence-specific hybridization of
the ribozyme molecule to complementary target RNA,
followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may
specifically and efficiently catalyze endonucleolytic cleavage of
sequences encoding EVLl.

Specific ribozyme cleavage sites within any potential
RNA target are initially identified by scanning the target
molecule for ribozyme cleavage sites, including the following
sequences: GUA, GUU, and GUC. Once identified, short
RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the
cleavage site, may be evaluated for secondary structural
features which may render the oligonucleotide inoperable.
The suitability of candidate targets may also be evaluated by
testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
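
The scanning step described above may be sketched, as a
non-limiting example, in the following way. The function names
and the choice to centre the triplet within the window are
assumptions of this sketch; the evaluation of secondary structure
and the ribonuclease protection assay are not modelled.

```python
# Hypothetical sketch: locate GUA, GUU and GUC triplets in an RNA target and
# collect the 15-20 nt windows that contain each site.

CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based index, triplet) for every candidate cleavage site."""
    rna = rna.upper()
    return [(i, rna[i:i + 3])
            for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_windows(rna: str, length: int = 17) -> list[tuple[int, str]]:
    """Return (site index, window) pairs with the triplet roughly centred.

    Sites too close to either end to fit a full window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("length should be between 15 and 20 nucleotides")
    rna = rna.upper()
    flank = (length - 3) // 2
    windows = []
    for i, _ in find_cleavage_sites(rna):
        start = i - flank
        end = start + length
        if start >= 0 and end <= len(rna):
            windows.append((i, rna[start:end]))
    return windows
```

Each window returned would then be a candidate for the structural
and accessibility evaluation described in the text.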

Complementary ribonucleic acid molecules and
ribozymes of the invention may be prepared by any method
known in the art for the synthesis of nucleic acid molecules.
These include techniques for chemically synthesizing
oligonucleotides such as solid phase phosphoramidite chemical
synthesis. Alternatively, RNA molecules may be generated
by in vitro and in vivo transcription of DNA sequences
encoding EVLl. Such DNA sequences may be incorporated
into a wide variety of vectors with suitable RNA polymerase
promoters such as T7 or SP6. Alternatively, these cDNA
constructs that synthesize complementary RNA,
constitutively or inducibly, can be introduced into cell lines, cells, or
tissues.

RNA molecules may be modified to increase intracellular
stability and half-life. Possible modifications include, but are
not limited to, the addition of flanking sequences at the 5'
and/or 3' ends of the molecule, or the use of phosphorothioate
or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent
in the production of PNAs and can be extended in all of
these molecules by the inclusion of nontraditional bases
such as inosine, queuosine, and wybutosine, as well as
acetyl-, methyl-, thio-, and similarly modified forms of
adenine, cytidine, guanine, thymine, and uridine which are
not as easily recognized by endogenous endonucleases.

Many methods for introducing vectors into cells or tissues
are available and equally suitable for use in vivo, in vitro,
and ex vivo. For ex vivo therapy, vectors may be introduced
into stem cells taken from the patient and clonally propagated
for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by
polycationic amino polymers may be achieved using methods
which are well known in the art. (See, e.g., Goldman, C.
K. et al. (1997) Nature Biotechnology 15:462-466.)

Any of the therapeutic methods described above may be 
applied to any subject in need of such therapy, including, for 
example, mammals such as dogs, cats, cows, horses, rabbits, 
monkeys, and most preferably, humans. 

An additional embodiment of the invention relates to the 
administration of a pharmaceutical or sterile composition, in 
conjunction with a pharmaceutically acceptable carrier, for 
any of the therapeutic effects discussed above. Such pharmaceutical
compositions may consist of EVLl, antibodies to
EVLl, and mimetics, agonists, antagonists, or inhibitors of 
EVLl. The compositions may be administered alone or in 
combination with at least one other agent, such as a stabi- 
lizing compound, which may be administered in any sterile, 
biocompatible pharmaceutical carrier including, but not 
limited to, saline, buffered saline, dextrose, and water. The 
compositions may be administered to a patient alone, or in 
combination with other agents, drugs, or hormones. 

The pharmaceutical compositions utilized in this inven- 
tion may be administered by any number of routes including, 
but not limited to, oral, intravenous, intramuscular, intra- 
arterial, intramedullary, intrathecal, intraventricular, 
transdermal, subcutaneous, intraperitoneal, intranasal, 
enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical 
compositions may contain suitable pharmaceutically- 
acceptable carriers comprising excipients and auxiliaries 
which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. Further
details on techniques for formulation and administration
may be found in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can
be formulated using pharmaceutically acceptable carriers
well known in the art in dosages suitable for oral administration.
Such carriers enable the pharmaceutical compositions
to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions, and the like, for
ingestion by the patient.

Pharmaceutical preparations for oral use can be obtained
through combining active compounds with solid excipient
and processing the resultant mixture of granules (optionally,
after grinding) to obtain tablets or dragee cores. Suitable
auxiliaries can be added, if desired. Suitable excipients
include carbohydrate or protein fillers, such as sugars,
including lactose, sucrose, mannitol, and sorbitol; starch
from corn, wheat, rice, potato, or other plants; cellulose,
such as methyl cellulose, hydroxypropylmethyl-cellulose, or
sodium carboxymethylcellulose; gums, including arabic and
tragacanth; and proteins, such as gelatin and collagen. If
desired, disintegrating or solubilizing agents may be added,
such as the cross-linked polyvinyl pyrrolidone, agar, and
alginic acid or a salt thereof, such as sodium alginate.

Dragee cores may be used in conjunction with suitable 
coatings, such as concentrated sugar solutions, which may 
also contain gum arabic, talc, polyvinylpyrrolidone, car- 
bopol gel, polyethylene glycol, and/or titanium dioxide, 
lacquer solutions, and suitable organic solvents or solvent 
mixtures. Dyestuffs or pigments may be added to the tablets
or dragee coatings for product identification or to characterize
the quantity of active compound, i.e., dosage.

Pharmaceutical preparations which can be used orally 
include push-fit capsules made of gelatin, as well as soft, 
sealed capsules made of gelatin and a coating, such as 
glycerol or sorbitol. Push-fit capsules can contain active 
ingredients mixed with fillers or binders, such as lactose or 
starches, lubricants, such as talc or magnesium stearate, and, 
optionally, stabilizers. In soft capsules, the active com- 
pounds may be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid, or liquid polyethylene glycol with 
or without stabilizers. 

Pharmaceutical formulations suitable for parenteral 
administration may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as 
Hanks' solution, Ringer's solution, or physiologically buffered
saline. Aqueous injection suspensions may contain
substances which increase the viscosity of the suspension, 
such as sodium carboxymethyl cellulose, sorbitol, or dext- 
ran. Additionally, suspensions of the active compounds may 
be prepared as appropriate oily injection suspensions. Suit- 
able lipophilic solvents or vehicles include fatty oils, such as 
sesame oil, or synthetic fatty acid esters, such as ethyl oleate, 
triglycerides, or liposomes. Non-lipid polycationic amino 
polymers may also be used for delivery. Optionally, the 
suspension may also contain suitable stabilizers or agents to 
increase the solubility of the compounds and allow for the 
preparation of highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate 
to the particular barrier to be permeated are used in the 
formulation. Such penetrants are generally known in the art. 

The pharmaceutical compositions of the present invention 
may be manufactured in a manner that is known in the art, 
e.g., by means of conventional mixing, dissolving, 
granulating, dragee-making, levigating, emulsifying, 
encapsulating, entrapping, or lyophilizing processes. 

The pharmaceutical composition may be provided as a 
salt and can be formed with many acids, including but not 
limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, 
malic, and succinic acids. Salts tend to be more soluble in 
aqueous or other protonic solvents than are the correspond- 
ing free base forms. In other cases, the preferred preparation 
may be a lyophilized powder which may contain any or all 
of the following: 1 mM to 50 mM histidine, 0.1% to 2% 
sucrose, and 2% to 7% mannitol, at a pH range of 4.5 to 5.5, 
that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, 
they can be placed in an appropriate container and labeled 
for treatment of an indicated condition. For administration 
of EVLl, such labeling would include amount, frequency, 
and method of administration. 

Pharmaceutical compositions suitable for use in the 
invention include compositions wherein the active ingredients
are contained in an effective amount to achieve the
intended purpose. The determination of an effective dose is 
well within the capability of those skilled in the art. 

For any compound, the therapeutically effective dose can
be estimated initially either in cell culture assays, e.g., of
neoplastic cells, or in animal models such as mice, rats,
rabbits, dogs, or pigs. An animal model may also be used to
determine the appropriate concentration range and route of
administration. Such information can then be used to determine
useful doses and routes for administration in humans.

A therapeutically effective dose refers to that amount of
active ingredient, for example EVLl or fragments thereof,
antibodies to EVLl, and agonists, antagonists or inhibitors
of EVLl, which ameliorates the symptoms or condition.
Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with
experimental animals, such as by calculating the ED50 (the
dose therapeutically effective in 50% of the population) or
LD50 (the dose lethal to 50% of the population) statistics.
The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50
ratio. Pharmaceutical compositions which exhibit large
therapeutic indices are preferred. The data obtained from
cell culture assays and animal studies are used to formulate
a range of dosage for human use. The dosage contained in
such compositions is preferably within a range of circulating
concentrations that includes the ED50 with little or no
toxicity. The dosage varies within this range depending upon
the dosage form employed, the sensitivity of the patient, and
the route of administration.
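
As a simple numerical illustration, an ED50 or LD50 may be
estimated from dose-response data and combined into a therapeutic
index. The sketch below assumes that the index is taken as LD50
divided by ED50, so that larger values are preferable, and that
the 50% point is found by linear interpolation on log10 of the
dose; both choices, and the function names, are assumptions of
this example rather than procedures prescribed herein.

```python
import math


def dose_at_half_response(doses, fractions):
    """Interpolate the dose giving a 50% response.

    `doses` must be positive and increasing; `fractions` are the matching
    response fractions (0..1), assumed non-decreasing. Interpolation is
    linear in log10(dose) between the two points that bracket 0.5.
    """
    points = list(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("data do not bracket a 50% response")


def therapeutic_index(ld50, ed50):
    """Ratio of LD50 to ED50; larger values mean a wider safety margin."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50
```

The same interpolation routine may be applied to efficacy data to
obtain an ED50 and to lethality data to obtain an LD50 before the
ratio is formed.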

The exact dosage will be determined by the practitioner,
in light of factors related to the subject requiring treatment.
Dosage and administration are adjusted to provide sufficient
levels of the active moiety or to maintain the desired effect.
Factors which may be taken into account include the severity
of the disease state, the general health of the subject, the
age, weight, and gender of the subject, time and frequency
of administration, drug combination(s), reaction
sensitivities, and response to therapy. Long-acting pharmaceutical
compositions may be administered every 3 to 4
days, every week, or biweekly depending on the half-life
and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to
100,000 μg, up to a total dose of about 1 gram, depending
upon the route of administration. Guidance as to particular 
dosages and methods of delivery is provided in the literature 
and generally available to practitioners in the art. Those 
skilled in the art will employ different formulations for 
nucleotides than for proteins or their inhibitors. Similarly, 
delivery of polynucleotides or polypeptides will be specific 
to particular cells, conditions, locations, etc. 

Diagnostics 

In another embodiment, antibodies which specifically
bind EVLl may be used for the diagnosis of disorders
characterized by expression of EVLl, or in assays to monitor
patients being treated with EVLl or agonists,
antagonists, or inhibitors of EVLl. Antibodies useful for
diagnostic purposes may be prepared in the same manner as
described above for therapeutics. Diagnostic assays for
EVLl include methods which utilize the antibody and a
label to detect EVLl in human body fluids or in extracts of
cells or tissues. The antibodies may be used with or without
modification, and may be labeled by covalent or non-covalent
attachment of a reporter molecule. A wide variety
of reporter molecules, several of which are described above,
are known in the art and may be used.

A variety of protocols for measuring EVLl, including
ELISAs, RIAs, and FACS, are known in the art and provide
a basis for diagnosing altered or abnormal levels of EVLl
expression. Normal or standard values for EVLl expression
are established by combining body fluids or cell extracts
taken from normal mammalian subjects, preferably human,
with antibody to EVLl under conditions suitable for
complex formation. The amount of standard complex formation
may be quantitated by various methods, preferably
by photometric means. Quantities of EVLl expressed in
subject samples from biopsied tissues are compared with
the standard values. Deviation between standard and subject
values establishes the parameters for diagnosing disease.
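
The comparison of subject values with standard values may be
sketched as follows. The use of the mean plus or minus two sample
standard deviations as the standard range is an assumption made
solely for this illustration; no particular threshold is specified
herein, and the function names are hypothetical.

```python
from statistics import mean, stdev


def standard_range(normal_values, k=2.0):
    """Return (low, high) as mean +/- k sample standard deviations."""
    if len(normal_values) < 2:
        raise ValueError("need at least two normal measurements")
    m = mean(normal_values)
    s = stdev(normal_values)
    return m - k * s, m + k * s


def classify(subject_value, normal_values, k=2.0):
    """Label a subject measurement relative to the standard range."""
    low, high = standard_range(normal_values, k)
    if subject_value < low:
        return "decreased"
    if subject_value > high:
        return "increased"
    return "within standard range"
```

In practice the width of the standard range would be chosen by
the practitioner for the particular assay and population.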

In another embodiment of the invention, the polynucle- 
otides encoding EVLl may be used for diagnostic purposes. 
The polynucleotides which may be used include oligonucle- 
otide sequences, complementary RNA and DNA molecules, 
and PNAs. The polynucleotides may be used to detect and 
quantitate gene expression in biopsied tissues in which 
expression of EVLl may be correlated with disease. The
diagnostic assay may be used to determine absence, 
presence, and excess expression of EVLl, and to monitor 
regulation of EVLl levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are
capable of detecting polynucleotide sequences, including 
genomic sequences, encoding EVLl or closely related mol- 
ecules may be used to identify nucleic acid sequences which 
encode EVLl. The specificity of the probe, whether it is 
made from a highly specific region, e.g., the 5' regulatory 
region, or from a less specific region, e.g., a conserved motif, 
and the stringency of the hybridization or amplification 
(maximal, high, intermediate, or low), will determine 
whether the probe identifies only naturally occurring 
sequences encoding EVLl, alleles, or related sequences.

Probes may also be used for the detection of related 
sequences, and should preferably contain at least 50% of the 
nucleotides from any of the EVLl encoding sequences. The 
hybridization probes of the subject invention may be DNA 
or RNA and may be derived from the sequence of SEQ ID 
NO: 2 or from genomic sequences including promoters, 
enhancers, and introns of the EVLl gene. 

Means for producing specific hybridization probes for 
DNAs encoding EVLl include the cloning of polynucleotide 
sequences encoding EVLl or EVLl derivatives into vectors 
for the production of mRNA probes. Such vectors are known 
in the art, are commercially available, and may be used to 
synthesize RNA probes in vitro by means of the addition of 
the appropriate RNA polymerases and the appropriate 
labeled nucleotides. Hybridization probes may be labeled by 
a variety of reporter groups, for example, by radionuclides 
such as 32P or 35S, or by enzymatic labels, such as alkaline
phosphatase coupled to the probe via avidin/biotin coupling 
systems, and the like. 

Polynucleotide sequences encoding EVLl may be used 
for the diagnosis of a disorder associated with expression of 
EVLl. Examples of such a disorder include, but are not 
limited to, a reproductive disorder, such as, disorders of 
prolactin production; infertility, including tubal disease, 
ovulatory defects, and endometriosis; disruptions of the 
estrous cycle, disruptions of the menstrual cycle, polycystic
ovary syndrome, ovarian hyperstimulation syndrome, 
endometrial and ovarian tumors, autoimmune disorders, 
ectopic pregnancy, and teratogenesis; cancer of the breast, 
fibrocystic breast disease, and galactorrhea; disruptions of 
spermatogenesis, abnormal sperm physiology, cancer of the
testis, cancer of the prostate, benign prostatic hyperplasia,
and prostatitis, carcinoma of the male breast and gynecomastia;
an immunological disorder, such as, AIDS, Addison's
disease, adult respiratory distress syndrome, allergies,
ankylosing spondylitis, amyloidosis, anemia, asthma,
atherosclerosis, autoimmune hemolytic anemia, autoimmune
thyroiditis, bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis,
dermatomyositis, diabetes mellitus, emphysema, erythema
nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's
syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome,
lupus erythematosus, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis,
osteoporosis, pancreatitis, polymyositis, rheumatoid
arthritis, scleroderma, Sjogren's syndrome, systemic
anaphylaxis, systemic lupus erythematosus, systemic
sclerosis, ulcerative colitis, Werner syndrome, and complications
of cancer, hemodialysis, and extracorporeal circulation;
viral, bacterial, fungal, parasitic, protozoal, and helminthic
infections; and trauma; a vesicle trafficking disorder,
such as, cystic fibrosis, glucose-galactose malabsorption
syndrome, hypercholesterolemia, diabetes mellitus, diabetes
insipidus, hyper- and hypoglycemia, Graves' disease, goiter,
Cushing's disease, and Addison's disease; gastrointestinal
disorders including ulcerative colitis, gastric and duodenal
ulcers; other conditions associated with abnormal vesicle
trafficking including AIDS; allergies including hay fever,
asthma, and urticaria (hives); autoimmune hemolytic anemia;
proliferative glomerulonephritis; inflammatory bowel
disease; multiple sclerosis; myasthenia gravis; rheumatoid
and osteoarthritis; scleroderma; Chediak-Higashi and
Sjogren's syndromes; systemic lupus erythematosus; toxic
shock syndrome; traumatic tissue damage; and viral,
bacterial, fungal, helminth, and protozoal infections; a nervous
system disorder, such as, akathisia, Alzheimer's
disease, amnesia, amyotrophic lateral sclerosis, bipolar
disorder, catatonia, cerebral neoplasms, dementia,
depression, diabetic neuropathy, Down's syndrome, tardive
dyskinesia, dystonias, epilepsy, Huntington's disease, multiple
sclerosis, neurofibromatosis, Parkinson's disease, paranoid
psychoses, postherpetic neuralgia, schizophrenia, and
Tourette's disorder; angina, anaphylactic shock,
arrhythmias, asthma, cardiovascular shock, Cushing's
syndrome, hypertension, hypoglycemia, myocardial
infarction, migraine, and pheochromocytoma; a developmental
disorder, such as, renal tubular acidosis, anemia,
Cushing's syndrome, achondroplastic dwarfism, Duchenne
and Becker muscular dystrophy, epilepsy, gonadal
dysgenesis, WAGR syndrome, Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial
dysplasia, hereditary keratodermas, hereditary neuropathies
such as Charcot-Marie-Tooth disease and neurofibromatosis,
hypothyroidism, hydrocephalus, seizure disorders such as
Sydenham's chorea and cerebral palsy, spina bifida, and
congenital glaucoma, cataract, or sensorineural hearing loss;
and a neoplastic disorder, such as, adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in particular, cancers of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver,
lung, muscle, ovary, pancreas, parathyroid, penis, prostate,
salivary glands, skin, spleen, testis, thymus, thyroid, and
uterus. The polynucleotide sequences encoding EVLl may
be used in Southern or northern analysis, dot blot, or other
membrane-based technologies; in PCR technologies; in
dipstick, pin, and ELISA assays; and in microarrays utilizing
fluids or tissues from patients to detect altered EVLl
expression. Such qualitative or quantitative methods are well
known in the art.

In a particular aspect, the nucleotide sequences encoding 
EVLl may be useful in assays that detect the presence of 
associated disorders, particularly those mentioned above. 
The nucleotide sequences encoding EVLl may be labeled 
by standard methods and added to a fluid or tissue sample 
from a patient under conditions suitable for the formation of 
hybridization complexes. After a suitable incubation period, 
the sample is washed and the signal is quantitated and 
compared with a standard value. If the amount of signal in 
the patient sample is significantly altered in comparison to a 
control sample then the presence of altered levels of nucle- 
otide sequences encoding EVLl in the sample indicates the 
presence of the associated disorder. Such assays may also be 
used to evaluate the efficacy of a particular therapeutic 
treatment regimen in animal studies, in clinical trials, or to 
monitor the treatment of an individual patient. 

In order to provide a basis for the diagnosis of a disorder
associated with expression of EVLl, a normal or standard
profile for expression is established. This may be accom- 
plished by combining body fluids or cell extracts taken from 
normal subjects, either animal or human, with a sequence, or 
a fragment thereof, encoding EVLl, under conditions suit- 
able for hybridization or amplification. Standard hybridiza- 
tion may be quantified by comparing the values obtained 
from normal subjects with values from an experiment in 
which a known amount of a substantially purified polynucle- 
otide is used. Standard values obtained in this manner may 
be compared with values obtained from samples from 
patients who are symptomatic for a disorder. Deviation from 
standard values is used to establish the presence of a 
disorder. 
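
The quantification against a known amount of purified
polynucleotide may be illustrated by a calibration line fitted to
standards and then inverted for a sample. The sketch assumes that
the signal is approximately linear in the amount of target over
the range of the standards, which is an assumption of this example
and would need to be verified for any particular assay; the
function names are hypothetical.

```python
def fit_line(amounts, signals):
    """Least-squares slope and intercept for signal = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired calibration points")
    mx = sum(amounts) / n
    my = sum(signals) / n
    sxx = sum((x - mx) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("calibration amounts must not all be equal")
    sxy = sum((x - mx) * (y - my) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    return slope, my - slope * mx


def amount_from_signal(signal, slope, intercept):
    """Invert the calibration line to estimate the amount of target."""
    if slope == 0:
        raise ValueError("flat calibration line cannot be inverted")
    return (signal - intercept) / slope
```

Values estimated in this way for normal subjects would supply the
standard values against which patient samples are compared.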

Once the presence of a disorder is established and a 
treatment protocol is initiated, hybridization assays may be 
repeated on a regular basis to determine if the level of 
expression in the patient begins to approximate that which is 
observed in the normal subject. The results obtained from 
successive assays may be used to show the efficacy of 
treatment over a period ranging from several days to months. 

With respect to cancer, the presence of a relatively high 
amount of transcript in biopsied tissue from an individual 
may indicate a predisposition for the development of the 
disease, or may provide a means for detecting the disease 
prior to the appearance of actual clinical symptoms. A more 
definitive diagnosis of this type may allow health profes- 
sionals to employ preventative measures or aggressive treat- 
ment earlier thereby preventing the development or further 
progression of the cancer. 

Additional diagnostic uses for oligonucleotides designed
from the sequences encoding EVL1 may involve the use of
PCR. These oligomers may be chemically synthesized,
generated enzymatically, or produced in vitro. Oligomers
will preferably contain a fragment of a polynucleotide
encoding EVL1, or a fragment of a polynucleotide
complementary to the polynucleotide encoding EVL1, and will
be employed under optimized conditions for identification of
a specific gene or condition. Oligomers may also be employed
under less stringent conditions for detection or quantitation
of closely related DNA or RNA sequences.

Methods which may also be used to quantitate the expression
of EVL1 include radiolabeling or biotinylating nucleotides,
coamplification of a control nucleic acid, and interpolating
results from standard curves. (See, e.g., Melby, P. C. et al.
(1993) J. Immunol. Methods 159:235-244; and Duplaa, C. et al.
(1993) Anal. Biochem. 212:229-236.) The speed of quantitation
of multiple samples may be accelerated by running the assay
in an ELISA format where the oligomer of interest is presented
in various dilutions and a spectrophotometric or colorimetric
response gives rapid quantitation.

In further embodiments, oligonucleotides or longer fragments
derived from any of the polynucleotide sequences described
herein may be used as targets in a microarray. The microarray
can be used to monitor the expression level of large numbers
of genes simultaneously and to identify genetic variants,
mutations, and polymorphisms. This information may be used
to determine gene function, to understand the genetic basis of
a disorder, to diagnose a disorder, and to develop and monitor
the activities of therapeutic agents.

Microarrays may be prepared, used, and analyzed using
methods known in the art. (See, e.g., Brennan, T. M. et al.
(1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996)
Proc. Natl. Acad. Sci. 93:10614-10619; Baldeschweiler et
al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R. A. et al.
(1997) Proc. Natl. Acad. Sci. 94:2150-2155; and Heller, M.
J. et al. (1997) U.S. Pat. No. 5,605,662.)

In another embodiment of the invention, nucleic acid
sequences encoding EVL1 may be used to generate
hybridization probes useful in mapping the naturally occurring
genomic sequence. The sequences may be mapped to a
particular chromosome, to a specific region of a chromosome,
or to artificial chromosome constructions, e.g., human
artificial chromosomes (HACs), yeast artificial chromosomes
(YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See,
e.g., Price, C. M. (1993) Blood Rev. 7:127-134; and Trask,
B. J. (1991) Trends Genet. 7:149-154.)

Fluorescent in situ hybridization (FISH) may be correlated
with other physical chromosome mapping techniques and
genetic map data. (See, e.g., Heinz-Ulrich, et al. (1995) in
Meyers, R. A. (ed.) Molecular Biology and Biotechnology, VCH
Publishers New York, N.Y., pp. 965-968.) Examples of genetic
map data can be found in various scientific journals or at the
Online Mendelian Inheritance in Man (OMIM) site. Correlation
between the location of the gene encoding EVL1 on a physical
chromosomal map and a specific disorder, or a predisposition
to a specific disorder, may help define the region of DNA
associated with that disorder. The nucleotide sequences of the
invention may be used to detect differences in gene sequences
among normal, carrier, and affected individuals.

In situ hybridization of chromosomal preparations and
physical mapping techniques, such as linkage analysis using
established chromosomal markers, may be used for extending
genetic maps. Often the placement of a gene on the
chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the number or arm of
a particular human chromosome is not known. New
sequences can be assigned to chromosomal arms by physical
mapping. This provides valuable information to investigators
searching for disease genes using positional cloning or
other gene discovery techniques. Once the disease or
syndrome has been crudely localized by genetic linkage to a
particular genomic region, e.g., AT to 11q22-23, any
sequences mapping to that area may represent associated or
regulatory genes for further investigation. (See, e.g., Gatti,
R. A. et al. (1988) Nature 336:577-580.) The nucleotide
sequence of the subject invention may also be used to detect
differences in the chromosomal location due to
translocation, inversion, etc., among normal, carrier, or
affected individuals.

In another embodiment of the invention, EVL1, its catalytic
or immunogenic fragments, or oligopeptides thereof can be
used for screening libraries of compounds in any of a variety
of drug screening techniques. The fragment employed in such
screening may be free in solution, affixed to a solid support,
borne on a cell surface, or located intracellularly. The
formation of binding complexes between EVL1 and the agent
being tested may be measured.

Another technique for drug screening provides for high
throughput screening of compounds having suitable binding
affinity to the protein of interest. (See, e.g., Geysen, et al.
(1984) PCT application WO84/03564.) In this method, large
numbers of different small test compounds are synthesized
on a solid substrate, such as plastic pins or some other
surface. The test compounds are reacted with EVL1, or
fragments thereof, and washed. Bound EVL1 is then
detected by methods well known in the art. Purified EVL1
can also be coated directly onto plates for use in the
aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the
peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug
screening assays in which neutralizing antibodies capable of
binding EVL1 specifically compete with a test compound
for binding EVL1. In this manner, antibodies can be used to
detect the presence of any peptide which shares one or more
antigenic determinants with EVL1.

In additional embodiments, the nucleotide sequences
which encode EVL1 may be used in any molecular biology
techniques that have yet to be developed, provided the new
techniques rely on properties of nucleotide sequences that
are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair
interactions.

The examples below are provided to illustrate the subject
invention and are not included for the purpose of limiting the
invention.

EXAMPLES

I. HEAONOT03 cDNA Library Construction

The HEAONOT03 cDNA library was constructed from
normal aorta tissue obtained from a 27-year-old Caucasian
female who died from intracranial bleeding.

The frozen tissue was homogenized and lysed in TRIZOL
reagent (1 g tissue/10 ml TRIZOL, Catalog #10296-028;
GIBCO-BRL), a monophasic solution of phenol and guanidine
isothiocyanate, using a Polytron PT-3000 homogenizer
(Brinkmann Instruments, Westbury, N.Y.). After a brief
incubation on ice, chloroform was added (1:5 v/v) and the
lysate was centrifuged. The upper aqueous layer was
removed to a fresh tube and the RNA precipitated with
isopropanol, resuspended in DEPC-treated water, and
treated with DNAse for 25 min at 37° C. The RNA was
extracted and precipitated as described before. The mRNA
was then isolated using the OLIGOTEX (QIAGEN, Inc.,
Chatsworth, Calif.) and used to construct the cDNA library.

The mRNA was handled according to the recommended
protocols in the SUPERSCRIPT plasmid system (Catalog
#18248-013, GIBCO-BRL). cDNA synthesis was initiated
with a Not I-oligo d(T) primer. Double-stranded cDNA was
blunted, ligated to EcoR I adaptors, digested with Not I,
fractionated on a SEPHAROSE CL4B column (Catalog
#275105-01: Pharmacia), and those cDNAs exceeding 400
bp were ligated into the Not I and EcoR I sites of the pINCY
1 vector (Incyte). The plasmid pINCY 1 was subsequently
transformed into DH5α competent cells (Catalog
#18258-012; GIBCO-BRL).

II. Isolation and Sequencing of cDNA Clones 

Plasmid DNA was released from the cells and purified 
using the REAL PREP 96 plasmid kit (Catalog #26173; 
QIAGEN, Inc.). The recommended protocol was employed 
except for the following changes: 1) the bacteria were 
cultured in 1 ml of sterile Terrific Broth (Catalog #22711, 
GIBCO-BRL) with carbenicillin at 25 mg/L and glycerol at 
0.4%; 2) after inoculation, the cultures were incubated for 19 
hours and at the end of incubation, the cells were lysed with 
0.3 ml of lysis buffer; and 3) following isopropanol 
precipitation, the plasmid DNA pellet was resuspended in
0.1 ml of distilled water. After the last step in the protocol,
samples were transferred to a 96-well block for storage at 4° C.

The cDNAs were sequenced by the method of Sanger et
al. (1975, J. Mol. Biol. 94:441f), using a MICROLAB 2200
(Hamilton, Reno, Nev.) in combination with Peltier thermal
cyclers (PTC200 from MJ Research, Watertown, Mass.) and
Applied Biosystems 377 DNA sequencing systems, and the
reading frame was determined.

III. Homology Searching of cDNA Clones and 

Their Deduced Proteins 

The nucleotide sequences and/or amino acid sequences of 
the Sequence Listing were used to query sequences in the 
GenBank, SwissProt, BLOCKS, and Pima II databases.
These databases, which contain previously identified and
annotated sequences, were searched for regions of homology
using BLAST (Basic Local Alignment Search Tool).
(See, e.g., Altschul, S. F. (1993) J. Mol. Evol. 36:290-300;
and Altschul et al. (1990) J. Mol. Biol. 215:403-410.)

BLAST produced alignments of both nucleotide and 
amino acid sequences to determine sequence similarity. 
Because of the local nature of the alignments, BLAST was 
especially useful in determining exact matches or in iden- 
tifying homologs which may be of prokaryotic (bacterial) or 
eukaryotic (animal, fungal, or plant) origin. Other algo- 
rithms could have been used when dealing with primary 
sequence patterns and secondary structure gap penalties. 
(See, e.g., Smith, T. et al. (1992) Protein Engineering
5:35-51.) The sequences disclosed in this application have
lengths of at least 49 nucleotides and have no more than 12%
uncalled bases (where N is recorded rather than A, C, G, or
T).
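
As a rough illustration of the acceptance criteria just stated (at least 49 nucleotides and no more than 12% uncalled bases), the following Python sketch checks a single sequence. The function name and its default parameters are illustrative assumptions and are not part of the procedure described here.

```python
def passes_quality_filter(seq: str,
                          min_length: int = 49,
                          max_uncalled_percent: int = 12) -> bool:
    """Return True if a sequence meets the length and uncalled-base limits.

    Hypothetical helper: the thresholds default to the values quoted in
    the text (49 nucleotides, 12% uncalled bases recorded as N).
    """
    seq = seq.upper()
    if len(seq) < min_length:
        return False
    uncalled = seq.count("N")
    # Integer comparison avoids floating-point rounding at the boundary.
    return uncalled * 100 <= max_uncalled_percent * len(seq)


if __name__ == "__main__":
    print(passes_quality_filter("ACGT" * 13))          # 52 nt, no N
    print(passes_quality_filter("A" * 87 + "N" * 13))  # 13% uncalled
```

A sequence of exactly 100 bases with 12 uncalled positions sits on the boundary and is accepted under this reading of "no more than 12%".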

The BLAST approach searched for matches between a 
query sequence and a database sequence. BLAST evaluated 
the statistical significance of any matches found, and 
reported only those matches that satisfy the user-selected 
threshold of significance. In this application, threshold was
set at 10^-[illegible] for nucleotides and 10^-8 for peptides.

Incyte nucleotide sequences were searched against the
GenBank databases for primate (pri), rodent (rod), and
other mammalian sequences (mam), and deduced
amino acid sequences from the same clones were then
searched against GenBank functional protein databases,
mammalian (mamp), vertebrate (vrtp), and eukaryote
(eukp), for homology.

Additionally, sequences identified from cDNA libraries
may be analyzed to identify those gene sequences encoding
conserved protein motifs using an appropriate analysis
program, e.g., the Block 2 Bioanalysis Program (Incyte,
Palo Alto, Calif.). This motif analysis program, based on
sequence information contained in the SWISS-PROT database
and PROSITE, is a method of determining the function of
uncharacterized proteins translated from genomic or cDNA
sequences. (See, e.g., Bairoch, A. et al. (1997) Nucleic Acids
Res. 25:217-221; and Attwood, T. K. et al. (1997) J. Chem.
Inf. Comput. Sci. 37:417-424.) PROSITE may be used to
identify common functional or structural domains in diver- 
gent proteins. The method is based on weight matrices. 
Motifs identified by this method are then calibrated against 
the SWISS-PROT database in order to obtain a measure of 
the chance distribution of the matches. 

In another alternative, Hidden Markov models (HMMs) 
may be used to find protein domains, each defined by a 
dataset of proteins known to have a common biological 
function. (See, e.g., Pearson, W. R. and D. J. Lipman (1988)
Proc. Natl. Acad. Sci. 85:2444-2448; and Smith, T. F. and
M. S. Waterman (1981) J. Mol. Biol. 147:195-197.) HMMs
were initially developed to examine speech recognition
patterns, but are now being used in a biological context to
analyze protein and nucleic acid sequences as well as to
model protein structure. (See, e.g., Krogh, A. et al. (1994) J.
Mol. Biol. 235:1501-1531; and Collin, M. et al. (1993)
Protein Sci. 2:305-314.) HMMs have a formal probabilistic 
basis and use position-specific scores for amino acids or 
nucleotides. The algorithm continues to incorporate infor- 
mation from newly identified sequences to increase its motif 
analysis capabilities. 

IV. Northern Analysis 

Northern analysis is a laboratory technique used to detect 
the presence of a transcript of a gene and involves the 
hybridization of a labeled nucleotide sequence to a mem- 
brane on which RNAs from a particular cell type or tissue 
have been bound. (See, e.g., Sambrook, supra, ch. 7; and 
Ausubel, F. M. et al. supra, ch. 4 and 16.) 

Analogous computer techniques applying BLAST are 
used to search for identical or related molecules in nucle- 
otide databases such as GENBANK or LIFESEQ database 
(Incyte Pharmaceuticals). This analysis is much faster than 
multiple membrane-based hybridizations. In addition, the
sensitivity of the computer search can be modified to deter- 
mine whether any particular match is categorized as exact or 
homologous. 

The basis of the search is the product score, which is 
defined as: 

(% sequence identity × % maximum BLAST score) / 100

The product score takes into account both the degree of
similarity between two sequences and the length of the 
sequence match. For example, with a product score of 40, 
the match will be exact within a 1% to 2% error, and, with 
a product score of 70, the match will be exact. Homologous 
molecules are usually identified by selecting those which 
show product scores between 15 and 40, although lower 
scores may identify related molecules. 
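
The product score defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, both inputs are taken to be percentages on a 0 to 100 scale, and the 15 to 40 window for homologous molecules is treated as an inclusive range.

```python
def product_score(percent_identity: float,
                  percent_max_blast_score: float) -> float:
    """Product score = (% sequence identity x % maximum BLAST score) / 100.

    Both arguments are assumed to be percentages between 0 and 100.
    """
    return (percent_identity * percent_max_blast_score) / 100.0


def is_candidate_homolog(score: float,
                         low: float = 15.0,
                         high: float = 40.0) -> bool:
    """Flag scores in the range the text associates with homologous molecules.

    Treating both bounds as inclusive is an assumption of this sketch.
    """
    return low <= score <= high


if __name__ == "__main__":
    score = product_score(50.0, 60.0)
    print(score, is_candidate_homolog(score))
```

Because the score multiplies identity by the normalized BLAST score, a short but perfect match and a long but imperfect match can receive similar values, which is why the text describes it as reflecting both similarity and match length.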

The results of northern analysis are reported as a list of
libraries in which the transcript encoding EVL1 occurs.
Abundance and percent abundance are also reported.
Abundance directly reflects the number of times a particular
transcript is represented in a cDNA library, and percent
abundance is abundance divided by the total number of
sequences examined in the cDNA library.

V. Extension of EVL1 Encoding Polynucleotides

The nucleic acid sequence of Incyte Clone 3089412 was
used to design oligonucleotide primers for extending a
partial nucleotide sequence to full length. One primer was
synthesized to initiate extension of an antisense
polynucleotide, and the other was synthesized to initiate
extension of a sense polynucleotide. Primers were used to
facilitate the extension of the known sequence "outward"
generating amplicons containing new unknown nucleotide
sequence for the region of interest. The initial primers were
designed from the cDNA using OLIGO 4.06 software
(National Biosciences, Plymouth, Minn.), or another
appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to
anneal to the target sequence at temperatures of about 68° C.
to about 72° C. Any stretch of nucleotides which would result
in hairpin structures and primer-primer dimerizations was
avoided.

Selected human cDNA libraries (GIBCO/BRL) were used
to extend the sequence. If more than one extension is
necessary or desired, additional sets of primers are designed
to further extend the known region.

High fidelity amplification was obtained by following the
instructions for the XL-PCR kit (Perkin Elmer) and
thoroughly mixing the enzyme and reaction mix. PCR was
performed using the Peltier thermal cycler (PTC200; M. J.
Research, Watertown, Mass.), beginning with 40 pmol of
each primer and the recommended concentrations of all
other components of the kit, with the following parameters:

Step 1    94° C. for 1 min (initial denaturation)
Step 2    65° C. for 1 min
Step 3    68° C. for 6 min
Step 4    94° C. for 15 sec
Step 5    65° C. for 1 min
Step 6    68° C. for 7 min
Step 7    Repeat steps 4 through 6 for an additional 15 cycles
Step 8    94° C. for 15 sec
Step 9    65° C. for 1 min
Step 10   68° C. for 7:15 min
Step 11   Repeat steps 8 through 10 for an additional 12 cycles
Step 12   72° C. for 8 min
Step 13   4° C. (and holding)

A 5 μl to 10 μl aliquot of the reaction mixture was
analyzed by electrophoresis on a low concentration (about
0.6% to 0.8%) agarose mini-gel to determine which
reactions were successful in extending the sequence. Bands
thought to contain the largest products were excised from the
gel, purified using QIAQUICK (QIAGEN Inc., Chatsworth,
Calif.), and trimmed of overhangs using Klenow enzyme to
facilitate religation and cloning.

After ethanol precipitation, the products were redissolved
in 13 μl of ligation buffer, 1 μl T4-DNA ligase (15 units) and
1 μl T4 polynucleotide kinase were added, and the mixture
was incubated at room temperature for 2 to 3 hours, or
overnight at 16° C. Competent E. coli cells (in 40 μl of
appropriate media) were transformed with 3 μl of ligation
mixture and cultured in 80 μl of SOC medium. (See, e.g.,
Sambrook, supra, Appendix A, p. 2.) After incubation for
one hour at 37° C., the E. coli mixture was plated on Luria
Bertani (LB) agar (See, e.g., Sambrook, supra, Appendix A,
p. 1) containing 2xCarb. The following day, several colonies
were randomly picked from each plate and cultured in 150
μl of liquid LB/2xCarb medium placed in an individual well
of an appropriate commercially-available sterile 96-well
microtiter plate. The following day, 5 μl of each overnight
culture was transferred into a non-sterile 96-well plate and,
after dilution 1:10 with water, 5 μl from each sample was
transferred into a PCR array.
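
The primer design criteria stated earlier in this example (about 22 to 30 nucleotides in length and a GC content of about 50% or more) can be screened with a short Python sketch. The function names and the example oligonucleotide are hypothetical and are not the primers actually used; the sketch does not estimate annealing temperature and does not check for hairpins or primer-primer dimers, which the text also requires.

```python
def gc_count(primer: str) -> int:
    """Count G and C bases in a primer (case-insensitive)."""
    primer = primer.upper()
    return primer.count("G") + primer.count("C")


def meets_basic_primer_criteria(primer: str,
                                min_len: int = 22,
                                max_len: int = 30,
                                min_gc_percent: int = 50) -> bool:
    """Check only the length and GC-content criteria quoted in the text.

    Annealing temperature and secondary-structure checks are deliberately
    out of scope for this illustration.
    """
    length = len(primer)
    if not (min_len <= length <= max_len):
        return False
    # Integer comparison: GC fraction >= min_gc_percent / 100.
    return gc_count(primer) * 100 >= min_gc_percent * length


if __name__ == "__main__":
    # Hypothetical 27-mer chosen for illustration only.
    candidate = "GCCATGGCCACAAGTGAACAGAGTATC"
    print(len(candidate), gc_count(candidate),
          meets_basic_primer_criteria(candidate))
```

In practice a dedicated design program such as the one named in the text would apply these checks together with melting-temperature and secondary-structure calculations.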

For PCR amplification, 18 μl of concentrated PCR reaction
mix (3.3x) containing 4 units of rTth DNA polymerase,
a vector primer, and one or both of the gene specific primers
used for the extension reaction were added to each well.
Amplification was performed using the following conditions:

Step 1    94° C. for 60 sec
Step 2    94° C. for 20 sec
Step 3    55° C. for 30 sec
Step 4    72° C. for 90 sec
Step 5    Repeat steps 2 through 4 for an additional 29 cycles
Step 6    72° C. for 180 sec
Step 7    4° C. (and holding)
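
To make the structure of the cycling program above explicit, the following Python sketch encodes the listed steps and adds up the programmed hold times. The data layout and names are assumptions made for illustration; ramp times between temperatures and the final 4° C. hold are not included, so the figure is a lower bound on the actual run time.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time in seconds


def programmed_seconds(initial: Sequence[Step],
                       cycle: Sequence[Step],
                       cycles: int,
                       final: Sequence[Step]) -> int:
    """Sum the hold times of a simple three-phase thermal cycling program."""
    return (sum(s.seconds for s in initial)
            + cycles * sum(s.seconds for s in cycle)
            + sum(s.seconds for s in final))


# Steps 1 to 6 of the amplification conditions listed above.
INITIAL = [Step(94, 60)]
CYCLE = [Step(94, 20), Step(55, 30), Step(72, 90)]
CYCLES = 1 + 29  # first pass plus "an additional 29 cycles"
FINAL = [Step(72, 180)]

if __name__ == "__main__":
    total = programmed_seconds(INITIAL, CYCLE, CYCLES, FINAL)
    print(total, "seconds, about", total // 60, "minutes")
```

The "additional 29 cycles" wording means the three-step block runs 30 times in total, which is the detail most easily miscounted when transcribing such programs.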



Aliquots of the PCR reactions were run on agarose gels 
together with molecular weight markers. The sizes of the 
PCR products were compared to the original partial cDNAs, 
and appropriate clones were selected, ligated into plasmid, 
and sequenced. 

In like manner, the nucleotide sequence of SEQ ID NO:2
is used to obtain 5' regulatory sequences using the procedure 
above, oligonucleotides designed for 5' extension, and an 
appropriate genomic library. 

VI. Labeling and Use of Individual Hybridization 
Probes 

Hybridization probes derived from SEQ ID NO:2 are
employed to screen cDNAs, genomic DNAs, or mRNAs.
Although the labeling of oligonucleotides, consisting of
about 20 base pairs, is specifically described, essentially the
same procedure is used with larger nucleotide fragments.
Oligonucleotides are designed using state-of-the-art
software such as OLIGO 4.06 software (National Biosciences)
and labeled by combining 50 pmol of each oligomer, 250
μCi of [γ-32P] adenosine triphosphate (Amersham, Chicago,
Ill.), and T4 polynucleotide kinase (DuPont NEN, Boston,
Mass.). The labeled oligonucleotides are substantially
purified using a SEPHADEX G-25 superfine resin column
(Pharmacia & Upjohn, Kalamazoo, Mich.). An aliquot
containing 10^7 counts per minute of the labeled probe is used
in a typical membrane-based hybridization analysis of human
genomic DNA digested with one of the following
endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II
(DuPont NEN, Boston, Mass.).

The DNA from each digest is fractionated on a 0.7 percent 
agarose gel and transferred to nylon membranes (Nytran 
Plus, Schleicher & Schuell, Durham, N.H.). Hybridization is
carried out for 16 hours at 40° C. To remove nonspecific
signals, blots are sequentially washed at room temperature
under increasingly stringent conditions up to 0.1x saline
sodium citrate and 0.5% sodium dodecyl sulfate. After
XOMAT AR film (Kodak, Rochester, N.Y.) is exposed to the
blots, hybridization patterns are compared visually.

VII. Microarrays 

A chemical coupling procedure and an ink jet device can
be used to synthesize array elements on the surface of a
substrate. (See, e.g., Baldeschweiler, supra.) An array
analogous to a dot or slot blot may also be used to arrange and
link elements to the surface of a substrate using thermal, UV,
chemical, or mechanical bonding procedures. A typical array
may be produced by hand or using available methods and
machines and contain any appropriate number of elements.
After hybridization, nonhybridized probes are removed and
a scanner used to determine the levels and patterns of
fluorescence. The degree of complementarity and the
relative abundance of each probe which hybridizes to an
element on the microarray may be assessed through analysis of
the scanned images.

Full-length cDNAs, Expressed Sequence Tags (ESTs), or
fragments thereof may comprise the elements of the
microarray. Fragments suitable for hybridization can be
selected using software well known in the art such as
LASERGENE. Full-length cDNAs, ESTs, or fragments
thereof corresponding to one of the nucleotide sequences of
the present invention, or selected at random from a cDNA
library relevant to the present invention, are arranged on an
appropriate substrate, e.g., a glass slide. The cDNA is fixed
to the slide using, e.g., UV cross-linking followed by
thermal and chemical treatments and subsequent drying. (See,
e.g., Schena, M. et al. (1995) Science 270:467-470; and
Shalon, D. et al. (1996) Genome Res. 6:639-645.)
Fluorescent probes are prepared and used for hybridization to
the elements on the substrate. The substrate is analyzed by
procedures described above.

VIII. Complementary Polynucleotides

Sequences complementary to the EVL1-encoding
sequences, or any parts thereof, are used to detect, decrease,
or inhibit expression of naturally occurring EVL1. Although
use of oligonucleotides comprising from about [illegible] base
pairs is described, essentially the same procedure is used
with smaller or with larger sequence fragments. Appropriate
oligonucleotides are designed using OLIGO 4.06 software
and the coding sequence of EVL1. To inhibit transcription,
a complementary oligonucleotide is designed from the most
unique 5' sequence and used to prevent promoter binding to
the coding sequence. To inhibit translation, a
complementary oligonucleotide is designed to prevent
ribosomal binding to the EVL1-encoding transcript.

IX. Expression of EVL1

Expression of EVL1 is accomplished by subcloning the
cDNA into an appropriate vector and transforming the
vector into host cells. This vector contains an appropriate
promoter, e.g., β-galactosidase upstream of the cloning site,
operably associated with the cDNA of interest. (See, e.g.,
Sambrook, supra pp. 404-433; and Rosenberg, M. et al.
(1983) Methods Enzymol. 101:123-138.)

Induction of an isolated, transformed bacterial strain with
isopropyl beta-D-thiogalactopyranoside (IPTG) using
standard methods produces a fusion protein which consists of
the first 8 residues of β-galactosidase, about 5 to 15 residues
of linker, and the full length protein. The signal residues
direct the secretion of EVL1 into bacterial growth media
which can be used directly in the following assay for
activity.

X. Demonstration of EVL1 Activity

The assay for human ena/VASP-like protein splice variant
(EVL1) is based upon the binding affinity of mouse ena/
VASP for the neural protein FE65. (Ermekova, K. S. et al.
supra.) Monkey cell line COS-7 (ATCC CRL 1651) cells are
grown in Dulbecco's modified Eagle's medium
supplemented with 10% fetal bovine serum and 1% penicillin/
streptomycin mixture in 5% CO2 atmosphere at 37° C.
3x10^[illegible] cells are transfected by electroporation with
20 μg of CMV-hemagglutinin-FE65 plasmid and 20 μg of
CMV-EVL1 plasmid as known in the art. 62 h after the
transfection, the cells are harvested in ice-cold PBS and
centrifuged at 2000 rpm at 4° C., and the pellet is dissolved
in lysis buffer ([illegible] mM Tris HCl, pH 7.5; 150 mM NaCl;
0.1 mM sodium vanadate; 50 mM NaF; 0.5% Nonidet P-40; 1
mM phenylmethylsulfonyl fluoride; 10 μg/ml each of
aprotinin, leupeptin, and pepstatin). The extracts are
clarified by centrifugation at 16,000 g at 4° C., and 4 mg of
supernatant are incubated for 1 h at 4° C. with an
anti-hemagglutinin monoclonal antibody or with an unrelated
monoclonal antibody. Thereafter, 30 μl of protein
A-SEPHAROSE resin (Pharmacia) are added to each
sample of the extract-antibody mixture, and the
immunocomplexes are eluted with 50 mM Tris HCl, pH 6.8, 2%
SDS, 10% glycerol, 100 mM dithiothreitol, 0.01%
bromphenol blue. The proteins are resolved by 7.5% SDS-PAGE
and transferred to Immobilon-P membranes (Millipore). The
filter is blocked in 5% nonfat dry milk in tris-buffered saline,
0.5% Tween, buffer (TBS-T) and incubated with anti-EVL1
antibodies at 1:1000 dilution for 1 h at room temperature.
After washing in TBS-T, the filter is exposed to horseradish
peroxidase-conjugated protein A (Amersham Corp.) at a
dilution of 1:5000 for 30 min at room temperature. The
signals are detected by chemiluminescence using the ECL
system (Amersham Corp.). The signal response is
proportional to [illegible] in the preparation.

XI. Production of EVL1 Specific Antibodies

EVL1 substantially purified using PAGE electrophoresis
(see, e.g., Harrington, M. G. (1990) Methods Enzymol.
182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard
protocols. The EVL1 amino acid sequence is analyzed using
DNASTAR software (DNASTAR Inc) to determine regions
of high immunogenicity, and a corresponding oligopeptide
is synthesized and used to raise antibodies by means known
to those of skill in the art. Methods for selection of
appropriate epitopes, such as those near the C-terminus or in
hydrophilic regions are well described in the art. (See, e.g.,
Ausubel et al., ch. 11.)

Typically, the oligopeptides are 15 residues in length, and
synthesized using an Applied Biosystems 431A Peptide
Synthesizer using fmoc-chemistry and coupled to KLH
(Sigma, St. Louis, Mo.) by reaction with
N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS)
to increase immunogenicity. (See, e.g., Ausubel et al. supra.)
Rabbits are immunized with the oligopeptide-KLH complex
in complete Freund's adjuvant. Resulting antisera are tested
for antipeptide activity, for example, by binding the peptide
to plastic, blocking with 1% BSA, reacting with rabbit
antisera, washing, and reacting with radio-iodinated goat
anti-rabbit IgG.

XII. Purification of Naturally Occurring EVL1
Using Specific Antibodies

Naturally occurring or recombinant EVL1 is substantially
purified by immunoaffinity chromatography using
antibodies specific for EVL1. An immunoaffinity column is
constructed by covalently coupling anti-EVL1 antibody to an
activated chromatographic resin, such as CNBr-activated
Sepharose (Pharmacia & Upjohn). After the coupling, the
resin is blocked and washed according to the manufacturer's
instructions.

Media containing EVL1 are passed over the
immunoaffinity column, and the column is washed under
conditions that allow the preferential absorbance of EVL1
(e.g., high ionic strength buffers in the presence of
detergent). The column is eluted under conditions that disrupt
antibody/EVL1 binding (e.g., a buffer of pH 2 to pH 3, or a
high concentration of a chaotrope, such as urea or thiocyanate
ion), and EVL1 is collected.

XIII. Identification of Molecules Which Interact
with EVL1

EVL1, or biologically active fragments thereof, are
labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton et
al. (1973) Biochem. J. 133:529.) Candidate molecules
previously arrayed in the wells of a multi-well plate are
incubated with the labeled EVL1, washed, and any wells
with labeled EVL1 complex are assayed. Data obtained
using different concentrations of EVL1 are used to calculate
values for the number, affinity, and association of EVL1 with
the candidate molecules.

Various modifications and variations of the described 
methods and systems of the invention will be apparent to 
those skilled in the art without departing from the scope and 
spirit of the invention. Although the invention has been 
described in connection with specific preferred 
embodiments, it should be understood that the invention as 
claimed should not be unduly limited to such specific 
embodiments. Indeed, various modifications of the 
described modes for carrying out the invention which are 
obvious to those skilled in molecular biology or related 
fields are intended to be within the scope of the following 
claims. 



SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 4

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 418 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: HEAONOT03
(B) CLONE: 3089412

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Ala Thr Ser Glu Gln Ser Ile Cys Gln Ala Arg Ala Ser Val Met
1               5                   10                  15

Val Tyr Asp Asp Thr Ser Lys Lys Trp Val Pro Ile Lys Pro Gly Gln
            20                  25                  30

Gln Gly Phe Ser Arg Ile Asn Ile Tyr His Asn Thr Ala Ser Asn Thr
        35                  40                  45

Phe Arg Val Val Gly Val Lys Leu Gln Asp Gln Gln Val Val Ile Asn
    50                  55                  60

Tyr Ser Ile Val Lys Gly Leu Lys Tyr Asn Gln Ala Thr Pro Thr Phe
65                  70                  75                  80

His Gln Trp Arg Asp Ala Arg Gln Val Tyr Gly Leu Asn Phe Ala Ser
                85                  90                  95

Lys Glu Glu Ala Thr Thr Phe Ser Asn Ala Met Leu Phe Ala Leu Asn
            100                 105                 110

Ile Met Asn Ser Gln Glu Gly Gly Pro Ser Ser Gln Arg Gln Val Gln
        115                 120                 125

Asn Gly Pro Ser Pro Asp Glu Met Asp Ile Gln Arg Arg Gln Val Met
    130                 135                 140

Glu Gln His Gln Gln Gln Arg Gln Glu Ser Leu Glu Arg Arg Thr Ser
145                 150                 155                 160

Ala Thr Gly Pro Ile Leu Pro Pro Gly His Pro Ser Ser Ala Ala Ser
                165                 170                 175

Ala Pro Val Ser Cys Ser Gly Pro Pro Pro Pro Pro Pro Pro Leu Val
            180                 185                 190

Pro Pro Pro Pro Thr Gly Ala Thr Pro Pro Pro Pro Pro Pro Leu Pro
        195                 200                 205

Ala Gly Gly Ala Gln Gly Ser Ser His Asp Glu Ser Ser Met Ser Gly
    210                 215                 220

Leu Ala Ala Ala Ile Ala Gly Ala Lys Leu Arg Arg Val Gln Arg Pro
225                 230                 235                 240

Glu Asp Ala Ser Gly Gly Ser Ser Pro Ser Gly Thr Ser Lys Ser Asp
                245                 250                 255

Ala Asn Arg Ala Ser Ser Gly Gly Gly Gly Gly Gly Leu Met Glu Glu
            260                 265                 270

Met Asn Lys Leu Leu Ala Lys Arg Arg Lys Ala Ala Ser Gln Ser Asp
        275                 280                 285

Lys Pro Ala Glu Lys Lys Glu Asp Glu Ser Gln Met Glu Asp Pro Ser
    290                 295                 300

Thr Ser Pro Ser Pro Gly Thr Arg Ala Ala Ser Gln Pro Pro Asn Ser
305                 310                 315                 320

Ser Glu Ala Gly Arg Lys Pro Trp Glu Arg Ser Asn Ser Val Glu Lys
                325                 330                 335

Pro Val Ser Ser Ile Leu Ser Arg Thr Pro Ser Val Ala Lys Ser Pro
            340                 345                 350

Glu Ala Lys Ser Pro Leu Gln Ser Gln Pro His Ser Arg Met Lys Pro
        355                 360                 365

Ala Gly Ser Val Asn Asp Met Ala Leu Asp Ala Phe Asp Leu Asp Arg
    370                 375                 380

Met Lys Gln Glu Ile Leu Glu Glu Val Val Arg Glu Leu His Lys Val
385                 390                 395                 400

Lys Glu Glu Ile Ile Asp Ala Ile Arg Gln Glu Leu Ser Gly Ile Ser
                405                 410                 415

Thr Thr



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1889 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: HEAONOT03

(B) CLONE: 3089412

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TTTAAGTAGG CTATAAAAAT CAAGTTGCTG TCTTCAGAGG GTCTGTGGTC CTCTGATCAA 60

CATAGGCTGG TGGGAGTACA GGACTCGCCT CCTCAGGGTT CCCTGTGCTG CCACTTTTCA 120 

GCCATGGCCA CAAGTGAACA GAGTATCTGC CAAGCCCGGG CTTCCGTGAT GGTCTACGAT 180 

GACACCAGTA AGAAATGGGT ACCAATCAAA CCTGGCCAGC AGGGATTCAG CCGGATCAAC 240 

ATCTACCACA ACACTGCCAG CAACACCTTC AGAGTCGTTG GAGTCAAGTT GCAGGATCAG 300 

CAGGTTGTGA TCAATTATTC AATCGTGAAA GGGCTGAAGT ACAATCAGGC CACGCCAACC 360 

TTCCACCAGT GGCGAGATGC CCGCCAGGTC TACGGCTTAA ACTTTGCAAG TAAAGAAGAG 420 

GCAACCACGT TCTCCAATGC AATGCTGTTT GCCCTGAACA TCATGAATTC CCAAGAAGGA 480 

GGCCCCTCCA GCCAGCGTCA GGTGCAGAAT GGCCCCTCTC CTGATGAGAT GGACATCCAG 540 

AGAAGACAAG TGATGGAGCA GCACCAGCAG CAGCGTCAGG AATCTCTAGA AAGAAGAACC 600 

TCGGCCACAG GGCCCATCCT CCCACCAGGA CATCCTTCAT CTGCAGCCAG CGCCCCCGTC 660

TCATGTAGTG GGCCTCCACC GCCCCCCCCA CCTCTAGTCC CACCTCCACC CACTGGGGCT 720

ACCCCACCTC CCCCACCCCC ACTGCCAGCC GGAGGAGCCC AGGGGTCCAG CCACGACGAG 780 

AGCTCCATGT CAGGACTGGC CGCTGCCATA GCTGGGGCCA AGCTGAGAAG AGTCCAACGG 840 

CCAGAAGACG CATCTGGAGG CTCCAGTCCC AGTGGGACCT CAAAGTCCGA TGCCAACCGG 900 

GCAAGCAGCG GGGGTGGCGG AGGAGGCCTC ATGGAGGAAA TGAACAAACT GCTGGCCAAG 960 

AGGAGAAAAG CAGCCTCCCA GTCAGACAAG CCAGCCGAGA AGAAGGAAGA TGAAAGCCAA 1020 

ATGGAAGATC CTAGTACCTC CCCCTCTCCG GGGACCCGAG CAGCCAGCCA GCCACCTAAC 1080 

TCCTCAGAGG CTGGCCGGAA GCCCTGGGAG CGGAGCAACT CGGTGGAGAA GCCTGTGTCC 1140 

TCGATTCTGT CCAGAACCCC GTCTGTGGCA AAGAGCCCCG AAGCTAAGAG CCCCCTTCAG 1200 

TCGCAGCCTC ACTCTAGGAT GAAGCCTGCT GGGAGCGTGA ATGACATGGC CCTGGATGCC 1260 

TTCGACTTGG ACCGGATGAA GCAGGAGATC CTAGAGGAGG TGGTGAGAGA GCTCCACAAG 1320 

GTGAAGGAGG AGATCATCGA CGCCATCAGG CAGGAGCTGA GTGGGATCAG CACCACGTAA 1380 

GGGGCCGGCC TCGCTGCGCT GATTCGTCGA GCCCATCCGG CGACAGAGGA CAGCCAGAAG 1440

CCCAGCCAGC CCCAGACTCC AGTGCACCAG AGCACGCACA GGAGCCTGGG CGCGCTGCTG 1500 

TGAAACGTCC TGACCTGTGA TCACACATGA CAGTGAGGAA ACCAAGTGCA ACTCCTGGGT 1560 

TTTTTTTAGA TTCTGCCTGA CACGGAACAC CAGGTCTGCT CGTCTTTTTT GTGTTTTATA 1620 

TTTGCTTATT TAAGGTACAT TTCTTTGGGT TTCTAGAGAC GCCCCTAAGT CACCTGCTTC 1680 

ATTAGACGGT TTCCAGGTTT TCTCCCAGGT GACGCTGTTA GCGCCTCAGC TGGCGGTGAC 1740

AGCCGGCCCA GCGTGGCGCC ACCACACACC GCAGAGCTGT CCAGGCACAG CTCCGTCCCC 1800 

AGCGCTCATG GTGTTGAAAC TGTCTGTCAT GCACCACGGT GTCTGTGTCC ACACAGTAAT 1860 

AAACGGTTTA CTGTCCGCAA AAAAAAAAA 1889 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GenBank 

(B) CLONE: 1644453

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Ser Glu Gln Ser Ile Cys Gln Ala Arg Ala Ser Val Met Val Tyr
1 5 10 15

Asp Asp Thr Ser Lys Lys Trp Val Pro Ile Lys Pro Gly Gln Gln Gly
20 25 30

Phe Ser Arg Ile Asn Ile Tyr His Asn Thr Ala Ser Ser Thr Phe Arg
35 40 45

Val Val Gly Val Lys Leu Gln Asp Gln Gln Val Val Ile Asn Tyr Ser
50 55 60

Ile Val Lys Gly Leu Lys Tyr Asn Gln Ala Thr Pro Thr Phe His Gln
65 70 75 80

Trp Arg Asp Ala Arg Gln Val Tyr Gly Leu Asn Phe Ala Ser Lys Glu
85 90 95

Glu Ala Thr Thr Phe Ser Asn Ala Met Leu Phe Ala Leu Asn Ile Met
100 105 110

Asn Ser Gln Glu Gly Gly Pro Ser Thr Gln Arg Gln Val Gln Asn Gly
115 120 125

Pro Ser Pro Glu Glu Met Asp Ile Gln Arg Arg Gln Val Met Glu Gln
130 135 140

Gln His Arg Gln Glu Ser Leu Glu Arg Arg Ile Ser Ala Thr Gly Pro
145 150 155 160

Ile Leu Pro Pro Gly His Pro Ser Ser Ala Ala Ser Thr Thr Leu Ser
165 170 175

Cys Ser Gly Pro Pro Pro Pro Pro Pro Pro Pro Val Pro Pro Pro Pro
180 185 190

Thr Gly Ser Thr Pro Pro Pro Pro Pro Pro Leu Pro Ala Gly Gly Ala
195 200 205

Gln Gly Thr Asn His Asp Glu Ser Ser Ala Ser Gly Leu Ala Ala Ala
210 215 220

Leu Ala Gly Ala Lys Leu Arg Arg Val Gln Arg Pro Glu Asp Ala Ser
225 230 235 240

Gly Gly Ser Ser Pro Ser Gly Thr Ser Lys Ser Asp Ala Asn Arg Ala
245 250 255

Ser Ser Gly Gly Gly Gly Gly Gly Leu Met Glu Glu Met Asn Lys Leu
260 265 270

Leu Ala Lys Arg Arg Lys Ala Ala Ser Gln Thr Asp Lys Pro Ala Asp
275 280 285

Arg Lys Glu Asp Glu Ser Gln Thr Glu Asp Pro Ser Thr Ser Pro Ser
290 295 300

Pro Gly Thr Arg Ala Thr Ser Gln Pro Pro Asn Ser Ser Glu Ala Gly
305 310 315 320

Arg Lys Pro Trp Glu Arg Ser Asn Ser Val Glu Lys Pro Val Ser Ser
325 330 335

Leu Leu Ser Arg Val Lys Pro Ala Gly Ser Val Asn Asp Val Gly Leu
340 345 350

Asp Ala Leu Asp Leu Asp Arg Met Lys Gln Glu Ile Leu Glu Glu Val
355 360 365

Val Arg Glu Leu His Lys Val Lys Glu Glu Ile Ile Asp Ala Ile Arg
370 375 380

Gln Glu Leu Ser Gly Ile Ser Thr Thr
385 390



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 380 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: GenBank

(B) CLONE: 624964

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ser Glu Thr Val Ile Cys Ser Ser Arg Ala Thr Val Met Leu Tyr
1 5 10 15

Asp Asp Gly Asn Lys Arg Trp Leu Pro Ala Gly Thr Gly Pro Gln Ala
20 25 30

Phe Ser Arg Val Gln Ile Tyr His Asn Pro Thr Ala Asn Ser Phe Arg
35 40 45

Val Val Gly Arg Lys Met Gln Pro Asp Gln Gln Val Val Ile Asn Cys
50 55 60

Ala Ile Val Arg Gly Val Lys Tyr Asn Gln Ala Thr Pro Asn Phe His
65 70 75 80

Gln Trp Arg Asp Ala Arg Gln Val Trp Gly Leu Asn Phe Gly Ser Lys
85 90 95

Glu Asp Ala Ala Gln Phe Ala Ala Gly Met Ala Ser Ala Leu Glu Ala
100 105 110

Leu Glu Gly Gly Gly Pro Pro Pro Pro Pro Ala Leu Pro Thr Trp Ser
115 120 125

Val Pro Asn Gly Pro Ser Pro Glu Glu Val Glu Gln Gln Lys Arg Gln
130 135 140

Gln Pro Gly Pro Ser Glu His Ile Glu Arg Arg Val Ser Asn Ala Gly
145 150 155 160

Gly Pro Pro Ala Pro Pro Ala Gly Gly Pro Pro Pro Pro Pro Gly Pro
165 170 175

Pro Pro Pro Pro Gly Pro Pro Pro Pro Pro Gly Leu Pro Pro Ser Gly
180 185 190

Val Pro Ala Ala Ala His Gly Ala Gly Gly Gly Pro Pro Pro Ala Pro
195 200 205

Pro Leu Pro Ala Ala Gln Gly Pro Gly Gly Gly Gly Ala Gly Ala Pro
210 215 220

Gly Leu Ala Ala Ala Ile Ala Gly Ala Lys Leu Arg Lys Val Ser Lys
225 230 235 240

Gln Glu Glu Ala Ser Gly Gly Pro Thr Ala Pro Lys Ala Glu Ser Gly
245 250 255

Arg Ser Gly Gly Gly Gly Leu Met Glu Glu Met Asn Ala Met Leu Ala
260 265 270

Arg Arg Arg Lys Ala Thr Gln Val Gly Glu Lys Thr Pro Lys Asp Glu
275 280 285

Ser Ala Asn Gln Glu Glu Pro Glu Ala Arg Val Pro Ala Gln Ser Glu
290 295 300

Ser Val Arg Arg Pro Trp Glu Lys Asn Ser Thr Thr Leu Pro Arg Met
305 310 315 320

Lys Ser Ser Ser Ser Val Thr Thr Ser Glu Thr Gln Pro Cys Thr Pro
325 330 335

Ser Ser Ser Asp Tyr Ser Asp Leu Gln Arg Val Lys Gln Glu Leu Leu
340 345 350

Glu Glu Val Lys Lys Glu Leu Gln Lys Val Lys Glu Glu Ile Ile Glu
355 360 365

Ala Phe Val Gln Glu Leu Arg Lys Arg Gly Ser Pro
370 375 380

What is claimed is: 
1. A substantially purified polypeptide comprising an 
amino acid sequence selected from the group consisting of 

a) an amino acid sequence of SEQ ID NO:1,

b) a naturally-occurring amino acid sequence having at
least 95% sequence identity to the sequence of SEQ ID
NO:1, wherein said amino acid sequence encodes a
polypeptide that binds the neural protein FE65.
2. A composition comprising a polypeptide of claim 1 in 
conjunction with a suitable pharmaceutical carrier.

TESTIS SPECIFIC GLYCOPROTEIN ZPEP10

REFERENCE TO RELATED APPLICATIONS 

This application is a divisional of application Ser. No. 
09/441346 filed Nov. 16, 1999 now issued as U.S. Pat. No. 
6,242,588 which is related to Provisional Application No. 
60/109,216, filed on Nov. 20, 1998. Under 35 U.S.C. § 
119(e)(1), this application claims benefit of said Provisional 
Application. 

BACKGROUND OF THE INVENTION 

The testis is the center for spermatogenesis, the process by 
which a germ cell proceeds through multiple stages of 
differentiation, and culminates in the formation of a termi- 
nally differentiated cell (spermatozoa or sperm) having a 
unique function. Within the testis are seminiferous tubules, 
where spermatogonium mature into spermatozoa. Surrounding
the seminiferous tubules are interstitial cells which
secrete androgens, such as testosterone, required for matu- 
ration and function of the testis and development of sec- 
ondary sexual characteristics. Disorders of the testis are 
common and have profound effect. Infertility can result from
disorders occurring during spermatogenesis. Many develop- 
mental disorders, such as hypogonadism, are associated with 
altered sex hormone production and levels in the testis. 
Testicular cancer, although rare, is the most common form of 
cancer in young men between the ages of 15 and 35. 

Testis specific proteins have therapeutic value in the 
treatment of disorders associated with the testis such as 
dysfunctional sperm production, infertility and testicular 
cancer. Towards this end, the present invention provides 
novel testis-specific membrane glycoproteins, soluble
ligands, agonists and antagonists, related compositions and 
methods as well as other uses that should be apparent to 
those skilled in the art from the teachings herein. 

BRIEF DESCRIPTION OF THE DRAWING 

FIGS. 1A-K are a Hopp/Woods hydrophilicity profile of
the amino acid sequence shown in SEQ ID NO:2. The profile
is based on a sliding six-residue window. Buried G, S, and 
T residues and exposed H, Y, and W residues were ignored. 
These residues are indicated in the FIG. by lower case 
letters. 
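
A sliding-window profile of this kind can be sketched as follows. This is a minimal illustration, not the procedure used to generate the figures: the scale values are the commonly published Hopp-Woods hydrophilicity values, the function name `hydrophilicity_profile` is hypothetical, and the choice to mark ignored residues in lower case and simply leave them out of each window's average is an assumption made for the example.

```python
# Commonly published Hopp-Woods hydrophilicity values, keyed by one-letter code.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(seq, window=6):
    """Return the mean hydrophilicity of every full window along seq.

    Assumption for this sketch: residues written in lower case are the
    "ignored" residues and are left out of the window average.
    """
    profile = []
    for start in range(len(seq) - window + 1):
        scored = [HOPP_WOODS[r] for r in seq[start:start + window] if r.isupper()]
        # A window made up only of ignored residues scores 0.0 by convention here.
        profile.append(sum(scored) / len(scored) if scored else 0.0)
    return profile


if __name__ == "__main__":
    # Hypothetical eight-residue peptide; "s" marks an ignored (buried) serine.
    print(hydrophilicity_profile("RDEKsNAL"))
```

Peaks in the returned list correspond to the most hydrophilic six-residue stretches, which is how such a profile is usually read when looking for likely antigenic sites.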

DETAILED DESCRIPTION OF THE 
INVENTION 

Prior to setting forth the invention in detail, it may be 
helpful to the understanding thereof to define the following
terms: 

The term "affinity tag" is used herein to denote a peptide 
segment that can be attached to a polypeptide to provide for 
purification or detection of the polypeptide or provide sites 
for attachment of the polypeptide to a substrate. In principle,
any peptide or protein for which an antibody or other
specific binding agent is available can be used as an affinity
tag. Affinity tags include a poly-histidine tract, protein A
(Nilsson et al., EMBO J. 4:1075, 1985; Nilsson et al.,
Methods Enzymol. 198:3, 1991), glutathione S transferase
(Smith and Johnson, Gene 67:31, 1988), Glu-Glu affinity
tag (Grussenmeyer et al., Proc. Natl. Acad. Sci. USA
82:7952-4, 1985), substance P, Flag™ peptide (Hopp et al.,
Biotechnology 6:1204-10, 1988; available from Eastman
Kodak Co., New Haven, Conn.), streptavidin binding
peptide, or other antigenic epitope or binding domain. See,
in general Ford et al., Protein Expression and Purification 2:
95-107, 1991. DNAs encoding affinity tags are available
from commercial suppliers (e.g., Pharmacia Biotech, 
Piscataway, N.J.). 

The term "allelic variant" denotes any of two or more 
alternative forms of a gene occupying the same chromosomal
locus. Allelic variation arises naturally through
mutation, and may result in phenotypic polymorphism 
within populations. Gene mutations can be silent (no change 
in the encoded polypeptide) or may encode polypeptides 
having altered amino acid sequence. The term allelic variant 
is also used herein to denote a protein encoded by an allelic 
variant of a gene. 

The terms "amino-terminal" and "carboxyl-terminal" are 
used herein to denote positions within polypeptides and 
proteins. Where the context allows, these terms are used 
with reference to a particular sequence or portion of a 
polypeptide or protein to denote proximity or relative posi- 
tion. For example, a certain sequence positioned carboxyl- 
terminal to a reference sequence within a protein is located 
proximal to the carboxyl terminus of the reference sequence, 
but is not necessarily at the carboxyl terminus of the 
complete protein. 

The term "complements of polynucleotide molecules" 
denotes polynucleotide molecules having a complementary 
base sequence and reverse orientation as compared to a
reference sequence. For example, the sequence
5' ATGCACGGG 3' is complementary to 5' CCCGTGCAT 3'.
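
The relationship between the two example sequences can be checked with a short sketch. The helper name `reverse_complement` is hypothetical and only DNA bases are handled; it illustrates the definition and is not part of the original disclosure.

```python
# Map each DNA base to its Watson-Crick partner (case is preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGCACGGG"))  # prints CCCGTGCAT
```

Applying the function twice returns the starting sequence, which is a quick sanity check on any complement routine.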

The term "contig" denotes a polynucleotide that has a 
contiguous stretch of identical or complementary sequence 
to another polynucleotide. Contiguous sequences are said to
"overlap" a given stretch of polynucleotide sequence either
in their entirety or along a partial stretch of the polynucleotide.
For example, representative contigs to the polynucleotide
sequence 5'-ATGGCTTAGCTT-3' (SEQ ID NO:12)
are 5'-TAGCTTgagtct-3' (SEQ ID NO:13) and
3'-gtcgacTACCGA-5' (SEQ ID NO:14).
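
The overlaps in this example can be made explicit with a small sketch. It assumes the reference sequence is 5'-ATGGCTTAGCTT-3' and finds the longest stretch shared between the reference and each contig, comparing the second contig through its reverse complement because it pairs with the reference rather than matching it. The function names are hypothetical and the dynamic-programming search is just one simple way to find a shared stretch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_shared_stretch(a, b):
    """Longest contiguous substring common to a and b (simple DP scan)."""
    best = ""
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > len(best):
                    best = a[i - cur[j]:i]
        prev = cur
    return best


reference = "ATGGCTTAGCTT"           # 5'-ATGGCTTAGCTT-3'
contig_identical = "TAGCTTGAGTCT"    # 5'-TAGCTTgagtct-3', upper-cased
contig_pairing = "AGCCATCAGCTG"      # 3'-gtcgacTACCGA-5' rewritten 5' to 3'

if __name__ == "__main__":
    print(longest_shared_stretch(reference, contig_identical))
    print(longest_shared_stretch(reference, reverse_complement(contig_pairing)))
```

The first contig shares an identical stretch with the 3' end of the reference, and the second is complementary to the 5' end, matching the upper-case portions of the example.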

The term "degenerate nucleotide sequence" denotes a
sequence of nucleotides that includes one or more degenerate
codons (as compared to a reference polynucleotide
molecule that encodes a polypeptide). Degenerate codons
contain different triplets of nucleotides, but encode the same
amino acid residue (i.e., GAU and GAC triplets each encode
Asp).
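
Degeneracy can be illustrated with the standard genetic code, in which aspartate is encoded by GAU and GAC while GAA and GAG encode glutamate. The sketch below builds the standard RNA codon table and lists the synonymous codons for a residue; the names `CODON_TABLE` and `codons_for` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order at each position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(residue):
    """Return every codon that encodes the given one-letter residue."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


if __name__ == "__main__":
    print(codons_for("D"))  # aspartate
    print(codons_for("E"))  # glutamate
```

Any two codons returned for the same residue are degenerate with respect to each other in the sense defined above.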

The term "expression vector" denotes a DNA molecule,
linear or circular, that comprises a segment encoding a
polypeptide of interest operably linked to additional segments
that provide for its transcription. Such additional
segments may include promoter and terminator sequences,
and may optionally include one or more origins of
replication, one or more selectable markers, an enhancer, a
polyadenylation signal, and the like. Expression vectors are
generally derived from plasmid or viral DNA, or may
contain elements of both.

The term "isolated", when applied to a polynucleotide,
denotes that the polynucleotide has been removed from its
natural genetic milieu and is thus free of other extraneous or
unwanted coding sequences, and is in a form suitable for use
within genetically engineered protein production systems.
Such isolated molecules are those that are separated from
their natural environment and include cDNA and genomic
clones. Isolated DNA molecules of the present invention are
free of other genes with which they are ordinarily associated,
but may include naturally occurring 5' and 3' untranslated
regions such as promoters and terminators. The identification
of associated regions will be evident to one of ordinary
skill in the art (see for example, Dynan and Tijan, Nature
316:774-78, 1985).

An "isolated" polypeptide or protein is a polypeptide or
protein that is found in a condition other than its native
environment, such as apart from blood and animal tissue. In
a preferred form, the isolated polypeptide is substantially
free of other polypeptides, particularly other polypeptides of
animal origin. It is preferred to provide the polypeptides in
a highly purified form, i.e. greater than 95% pure, more
preferably greater than 99% pure. When used in this context,
the term "isolated" does not exclude the presence of the
same polypeptide in alternative physical forms, such as
dimers or alternatively glycosylated or derivatized forms.

The term "operably linked", when referring to DNA
segments, denotes that the segments are arranged so that
they function in concert for their intended purposes, e.g.
transcription initiates in the promoter and proceeds through
the coding segment to the terminator.

The term "ortholog" denotes a polypeptide or protein
obtained from one species that is the functional counterpart
of a polypeptide or protein from a different species.
Sequence differences among orthologs are the result of
speciation.

The term "polynucleotide" denotes a single- or double-stranded
polymer of deoxyribonucleotide or ribonucleotide
bases read from the 5' to the 3' end. Polynucleotides include
RNA and DNA, and may be isolated from natural sources,
synthesized in vitro, or prepared from a combination of
natural and synthetic molecules. Sizes of polynucleotides
are expressed as base pairs (abbreviated "bp"), nucleotides
("nt"), or kilobases ("kb"). Where the context allows, the
latter two terms may describe polynucleotides that are
single-stranded or double-stranded. When the term is
applied to double-stranded molecules it is used to denote
overall length and will be understood to be equivalent to the
term "base pairs". It will be recognized by those skilled in
the art that the two strands of a double-stranded polynucleotide
may differ slightly in length and that the ends thereof
may be staggered as a result of enzymatic cleavage; thus all
nucleotides within a double-stranded polynucleotide molecule
may not be paired. Such unpaired ends will in general
not exceed 20 nt in length.

A "polypeptide" is a polymer of amino acid residues
joined by peptide bonds, whether produced naturally or
synthetically. Polypeptides of less than about 10 amino acid
residues are commonly referred to as "peptides".

"Probes and/or primers" as used herein can be RNA or
DNA. DNA can be either cDNA or genomic DNA. Polynucleotide
probes and primers are single or double-stranded
DNA or RNA, generally synthetic oligonucleotides, but may
be generated from cloned cDNA or genomic sequences or its
complements. Analytical probes will generally be at least 20
nucleotides in length, although somewhat shorter probes
(14-17 nucleotides) can be used. PCR primers are at least 5
nucleotides in length, preferably 15 or more nt, more preferably
20-30 nt. Short polynucleotides can be used when a
small region of the gene is targeted for analysis. For gross
analysis of genes, a polynucleotide probe may comprise an
entire exon or more. Probes can be labeled to provide a
detectable signal, such as with an enzyme, biotin, a
radionuclide, fluorophore, chemiluminescer, paramagnetic
particle and the like, which are commercially available from
many sources, such as Molecular Probes, Inc., Eugene,
Oreg., and Amersham Corp., Arlington Heights, Ill., using
techniques that are well known in the art.

The term "promoter" denotes a portion of a gene containing
DNA sequences that provide for the binding of RNA
polymerase and initiation of transcription. Promoter
sequences are commonly, but not always, found in the 5'
non-coding regions of genes.

A "protein" is a macromolecule comprising one or more
polypeptide chains. A protein may also comprise non-peptidic
components, such as carbohydrate groups. Carbohydrates
and other non-peptidic substituents may be added
to a protein by the cell in which the protein is produced, and
vary with the type of cell. Proteins are defined herein in
terms of their amino acid backbone structures; substituents
such as carbohydrate groups are generally not specified, but
may be present nonetheless.

The term "receptor" denotes a cell-associated protein that
binds to a bioactive molecule (i.e., a ligand) and mediates
the effect of the ligand on the cell. Membrane-bound receptors
are characterized by a multi-domain structure comprising
an extracellular ligand-binding domain and an intracellular
effector domain that is typically involved in signal
transduction. Binding of ligand to receptor results in a
conformational change in the receptor that causes an interaction
between the effector domain and other molecule(s) in
the cell. This interaction in turn leads to an alteration in the
metabolism of the cell. Metabolic events that are linked to
receptor-ligand interactions include gene transcription,
phosphorylation, dephosphorylation, increases in cyclic
AMP production, mobilization of cellular calcium, mobilization
of membrane lipids, cell adhesion, hydrolysis of
inositol lipids and hydrolysis of phospholipids. Most nuclear
receptors also exhibit a multi-domain structure, including an
amino-terminal, transactivating domain, a DNA binding
domain and a ligand binding domain. In general, receptors
can be membrane bound, cytosolic or nuclear; monomeric
(e.g., thyroid stimulating hormone receptor, beta-adrenergic
receptor) or multimeric (e.g., PDGF receptor, growth hormone
receptor, IL-3 receptor, GM-CSF receptor, G-CSF
receptor, erythropoietin receptor and IL-6 receptor).

The term "secretory signal sequence" denotes a DNA
sequence that encodes a polypeptide (a "secretory peptide")
that, as a component of a larger polypeptide, directs the
larger polypeptide through a secretory pathway of a cell in
which it is synthesized. The larger peptide is commonly
cleaved to remove the secretory peptide during transit
through the secretory pathway.

The term "soluble receptor" is used herein to refer to a
receptor polypeptide that is not bound to a cell membrane.
Soluble receptors are most commonly ligand-binding receptor
polypeptides that lack transmembrane and cytoplasmic
domains. Soluble receptors can comprise additional amino
acid residues, such as affinity tags that provide for purification
of the polypeptide or provide sites for attachment of the
polypeptide to a substrate. Many cell-surface receptors have
naturally occurring, soluble counterparts that are produced
by proteolysis or translated from alternatively spliced
mRNAs. Receptor polypeptides are said to be substantially
free of transmembrane and intracellular polypeptide segments
when they lack sufficient portions of these segments
to provide membrane anchoring or signal transduction,
respectively.

The term "splice variant" is used herein to denote alternative
forms of RNA transcribed from a gene. Splice variation
arises naturally through use of alternative splicing sites
within a transcribed RNA molecule, or less commonly
between separately transcribed RNA molecules, and may
result in several mRNAs transcribed from the same gene.
Splice variants may encode polypeptides having altered
amino acid sequence. The term splice variant is also used
herein to denote a protein encoded by a splice variant of an
mRNA transcribed from a gene.

Molecular weights and lengths of polymers determined by
imprecise analytical methods (e.g., gel electrophoresis) will 
be understood to be approximate values. When such a value 
is expressed as "about" X or "approximately" X, the stated 
value of X will be understood to be accurate to ±10%. 

All references cited herein are incorporated by reference 
in their entirety. 

Within one aspect the invention provides an isolated
polypeptide comprising an extracellular domain, wherein
the extracellular domain comprises amino acid residues 22
to 111 of the amino acid sequence of SEQ ID NO:2. Within
one embodiment the polypeptide further comprises a
transmembrane domain that resides in a carboxyl-terminal position
relative to the extracellular domain, wherein the
transmembrane domain comprises amino acid residues 112 to 133 of
the amino acid sequence of SEQ ID NO:2. Within another
embodiment the polypeptide further comprises a cytoplasmic
domain that resides in a carboxyl-terminal position
relative to the transmembrane domain, wherein the
cytoplasmic domain comprises amino acid residues 134 to 142
of the amino acid sequence of SEQ ID NO:2. Within another
embodiment the polypeptide further comprises a secretory
signal that resides in an amino-terminal position relative to
the extracellular domain, wherein the secretory signal
sequence comprises amino acid residues 1 to 20 of the amino
acid sequence of SEQ ID NO:2.

The invention also provides an isolated polypeptide as 
described herein comprising amino acid residue 1 to amino 
acid residue 142 of SEQ ID NO:2.

Also provided is an isolated polypeptide as described 
herein, covalently linked amino terminally or carboxy
terminally to a moiety selected from the group consisting of
affinity tags, toxins, radionucleotides, enzymes and
fluorophores.

Within another aspect the invention provides an isolated
polypeptide comprising a sequence of amino acid residues
that is at least 80% identical to amino acid residue 21 to
amino acid residue 142 of SEQ ID NO:2, wherein the
polypeptide specifically binds with an antibody that
specifically binds with a polypeptide having the amino acid
sequence of SEQ ID NO:2. Within one embodiment any
difference between the amino acid sequence of the isolated
polypeptide and the corresponding amino acid sequence of
SEQ ID NO:2 is due to a conservative amino acid
substitution. Within another embodiment the amino acid percent
identity is determined using a FASTA program with ktup=1,
gap opening penalty=10, gap extension penalty=1, and
substitution matrix=blosum62, with other parameters set as
default.

The invention provides an isolated polypeptide compris- 
ing the amino acid sequence of amino acid residue 1 to 
amino acid residue 20 of SEQ ID NO:2.

Also provided is an isolated polypeptide selected from the 
group consisting of: 

a) amino acid residues 21-111 of SEQ ID NO:2;

b) amino acid residues 112-133 of SEQ ID NO:2;

c) amino acid residues 134-142 of SEQ ID NO:2;

d) amino acid residues 1-20 of SEQ ID NO:2;

e) amino acid residues 21-133 of SEQ ID NO:2;

f) amino acid residues 112-142 of SEQ ID NO:2;

g) amino acid residues 1-111 of SEQ ID NO:2; and

h) amino acid residues 1-133 of SEQ ID NO:2.

Within another aspect the invention provides a fusion
protein consisting of a first portion and a second portion
joined by a peptide bond, the first portion comprising a
polypeptide as described herein, and the second portion 
comprising another polypeptide. 

The invention also provides a polypeptide as described 
herein in combination with a pharmaceutically acceptable
vehicle. 

Within another aspect the invention provides an antibody 
that specifically binds to an epitope of a polypeptide as
described herein. Within one embodiment the antibody is
selected from the group consisting of: a) polyclonal
antibody; b) murine monoclonal antibody; c) humanized
antibody derived from b); and d) human monoclonal antibody.

Within another embodiment the antibody fragment is
selected from the group consisting of F(ab')2, F(ab)2, Fab',
Fab, Fv, scFv, and minimal recognition unit.

Also provided is an anti-idiotype antibody that specifically
binds to an antibody as described herein. Also provided is a
binding protein that specifically binds to an epitope of a
polypeptide as described herein.

Within another aspect the invention provides a method of
producing an antibody to a polypeptide comprising: inoculating
an animal with a polypeptide as described herein;
wherein the polypeptide elicits an immune response in the
animal to produce the antibody; and isolating the antibody
from the animal.

Within another aspect is provided an isolated polynucleotide
encoding a polypeptide comprising an extracellular
domain, wherein the extracellular domain comprises amino
acid residues 22 to 111 of the amino acid sequence of SEQ
ID NO:2. Within one embodiment the polypeptide further
comprises a transmembrane domain that resides in a
carboxyl-terminal position relative to the extracellular
domain, wherein the transmembrane domain comprises
amino acid residues 112 to 133 of the amino acid sequence
of SEQ ID NO:2. Within another embodiment the polypeptide
further comprises a cytoplasmic domain that resides in
a carboxyl-terminal position relative to the transmembrane
domain, wherein the cytoplasmic domain comprises amino
acid residues 134 to 142 of the amino acid sequence of SEQ
ID NO:2. Within yet another embodiment the polypeptide
further comprises a secretory signal that resides in an
amino-terminal position relative to the extracellular domain,
wherein the secretory signal sequence comprises amino acid
residues 1 to 20 of the amino acid sequence of SEQ ID
NO:2.

The invention also provides an isolated polynucleotide as
described herein encoding a polypeptide comprising amino 
acid residue 1 to amino acid residue 142 of SEQ ID NO:2. 

Also provided is an isolated polynucleotide as described
herein, wherein the polypeptide is covalently linked amino
terminally or carboxy terminally to a moiety selected from
the group consisting of affinity tags, toxins,
radionucleotides, enzymes and fluorophores.

Within another aspect the invention provides an isolated
polynucleotide encoding a polypeptide comprising a
sequence of amino acid residues that is at least 80% identical
to amino acid residue 21 to amino acid residue 142 of SEQ
ID NO:2, wherein the polypeptide specifically binds with an
antibody that specifically binds with a polypeptide having
the amino acid sequence of SEQ ID NO:2. Within one
embodiment any difference between the amino acid
sequence of the isolated polypeptide and the corresponding
amino acid sequence of SEQ ID NO:2 is due to a conservative
amino acid substitution. Within another embodiment
the amino acid percent identity is determined using a FASTA
program with ktup=1, gap opening penalty=10, gap extension
penalty=1, and substitution matrix=blosum62, with
other parameters set as default.

The invention also provides an isolated polynucleotide
selected from the group consisting of:

a) a sequence of nucleotides from nucleotide 139 to
nucleotide 411 of SEQ ID NO:1;

b) a sequence of nucleotides from nucleotide 139 to
nucleotide 477 of SEQ ID NO:1;

c) a sequence of nucleotides from nucleotide 139 to
nucleotide 504 of SEQ ID NO:1;

d) a sequence of nucleotides from nucleotide 79 to
nucleotide 504 of SEQ ID NO:1;

e) a sequence of nucleotides from nucleotide 1 to
nucleotide 1094 of SEQ ID NO:1;

f) a polynucleotide that remains hybridized following
stringent wash conditions to a polynucleotide consisting
of the nucleotide sequence of SEQ ID NO:1, or the
complement of SEQ ID NO:1; and

g) nucleotide sequences complementary to a), b), c), d),
e), or f).

Further provided is an isolated polynucleotide encoding a
fusion protein consisting of a first portion and a second
portion joined by a peptide bond, the first portion comprising
a polypeptide as described herein; and the second portion
comprising another polypeptide.

Also provided is an isolated polynucleotide encoding a
fusion protein comprising a secretory signal sequence having
the amino acid sequence of amino acid residues 1-20 of
SEQ ID NO:2, wherein the secretory signal sequence is
operably linked to an additional polypeptide.

The invention also provides an isolated polynucleotide
comprising the sequence of nucleotide 1 to nucleotide 426 of
SEQ ID NO:3.

Within another aspect is provided an expression vector 
comprising the following operably linked elements: 

a transcription promoter; a DNA segment encoding a 
polypeptide as described herein; and a transcription termi- 
nator. 

Within one embodiment the DNA segment encodes a
polypeptide covalently linked amino terminally or carboxy
terminally to an affinity tag. Within another embodiment the
DNA segment further encodes a secretory signal sequence
operably linked to the polypeptide. Within yet another
embodiment the secretory signal sequence comprises
residues 1 to 20 of SEQ ID NO:2.

The invention also provides a cultured cell into which has 
been introduced an expression vector as described herein; 
wherein the cell expresses the polypeptide encoded by the 
DNA segment. 

The invention also provides a method of producing a 
polypeptide comprising: culturing a cell into which has been 
introduced an expression vector as described herein; 
whereby the cell expresses the polypeptide encoded by the 
DNA segment; and recovering the expressed polypeptide. 

The present invention is based in part upon the discovery
of a novel DNA sequence (SEQ ID NO:1) and the corresponding
deduced polypeptide sequence (SEQ ID NO:2)
which encode a testis-specific polypeptide designated
zpep10. The novel zpep10 polypeptide-encoding polynucleotides
of the present invention were initially identified by
querying an EST database for polypeptides containing
repetitive patterns and post-translational processing sites
yielding potentially active peptides. The polypeptide
corresponding to an EST meeting those search criteria was further
analyzed and found to be a membrane glycoprotein. The
EST sequence was from a testis cell library. Several clones
considered likely to contain the entire coding region were
used for sequencing and resulted in an incompletely spliced
message. A minimal nucleotide sequence having all potential
introns spliced out was generated. The full length cDNA
sequence was identified from a testis library and is disclosed
in SEQ ID NO:1. The deduced amino acid sequence of this
polynucleotide sequence is disclosed in SEQ ID NO:2.
Analysis of the DNA encoding a zpep10 polypeptide (SEQ
ID NO:1) revealed an open reading frame encoding 142
amino acids (SEQ ID NO:2) comprising a putative signal
sequence (residues 1 to 20 of SEQ ID NO:2, nucleotides 79
to 138 of SEQ ID NO:1) and 122 amino acids of predicted
mature sequence (residues 21 to 142 of SEQ ID NO:2,
nucleotides 139 to 504 of SEQ ID NO:1) containing an
extracellular domain (residues 21 to 111 of SEQ ID NO:2,
nucleotides 139 to 411 of SEQ ID NO:1) containing six
cysteine residues, amino acid residues 35, 45, 84, 87, 94 and
100 of SEQ ID NO:2, a tri-basic amino acid cleavage site,
amino acid residues 97-99 of SEQ ID NO:2; potential
N-linked glycosylation sites at amino acid residues 83 and
86 of SEQ ID NO:2; and potential O-glycosylation sites at
amino acid residues 28, 36, 48, 52, 60, 65, 68, 78, 79, 80, 85,
86, 90, 93 and 104 of SEQ ID NO:2; a putative transmembrane
domain (residues 112 to 133 of SEQ ID NO:2,
nucleotides 412 to 477 of SEQ ID NO:1) and a cytoplasmic
domain (residues 134 to 142 of SEQ ID NO:2, nucleotides
478 to 504 of SEQ ID NO:1). The overall structure of
zpep10 is helical. Those skilled in the art will recognize that
these domain boundaries are approximate, and are based on
alignments with known proteins and predictions of protein
folding. Zpep10 does not share significant homology with
any known protein.
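The residue and nucleotide ranges quoted above are related by simple codon arithmetic. The following is a minimal sketch, assuming (as the stated ranges imply) that the open reading frame begins at nucleotide 79 of SEQ ID NO:1 and is uninterrupted; the function and dictionary names are illustrative only.

```python
# Assumption: the codon for residue 1 starts at nucleotide 79 of SEQ ID NO:1
# and the reading frame continues without interruption.
ORF_START = 79


def residues_to_nucleotides(first_res, last_res, orf_start=ORF_START):
    """Return the 1-based nucleotide range encoding residues first_res..last_res."""
    if first_res < 1 or last_res < first_res:
        raise ValueError("residue range must be 1-based and ordered")
    start = orf_start + 3 * (first_res - 1)  # first base of the first codon
    end = orf_start + 3 * last_res - 1       # third base of the last codon
    return start, end


# Domain boundaries (residue numbers) as stated in the text.
DOMAINS = {
    "signal sequence": (1, 20),
    "mature sequence": (21, 142),
    "extracellular domain": (21, 111),
    "transmembrane domain": (112, 133),
    "cytoplasmic domain": (134, 142),
}

for name, (first, last) in DOMAINS.items():
    start, end = residues_to_nucleotides(first, last)
    print(f"{name}: residues {first}-{last} -> nucleotides {start}-{end}")
```

Each computed range can be compared against the nucleotide coordinates given in the paragraph above as a consistency check.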

Many proteins and hormones are processed into their
mature forms by highly-specific proteolytic enzymes,
prohormone convertases, which carry out intracellular cleavage
at the COOH-terminal side of dibasic sites within their
substrate polypeptides. There are only a few dibasic amino
acid combinations, including lys-lys, arg-arg, arg-lys and
lys-arg. Zpep10 polypeptides may be processed into an
active form through cleavage after lys (amino acid residue
98 of SEQ ID NO:2) or arg (amino acid residue 99 of SEQ
ID NO:2) of the tribasic site arg-lys-arg (amino acid residues
97-99 of SEQ ID NO:2). Prohormone convertase PC4
exhibits highly specific testis expression (WIPO publication,
WO98/50560) and may serve to cleave the zpep10
polypeptide.
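The dibasic-site rule described above can be illustrated with a short scan. This is a sketch under stated assumptions: any two adjacent Lys/Arg residues are treated as a candidate dibasic site, the peptide used is a hypothetical alanine filler (not SEQ ID NO:2), and the function name is illustrative.

```python
BASIC = {"K", "R"}  # lysine and arginine, one-letter codes


def dibasic_cleavage_positions(peptide):
    """Return 1-based residue numbers after which cleavage could occur.

    A position p is reported when residues p-1 and p are both basic,
    i.e. p is the COOH-terminal residue of a dibasic pair
    (lys-lys, arg-arg, arg-lys or lys-arg).
    """
    peptide = peptide.upper()
    return [
        i + 1  # 0-based index of the second residue -> 1-based residue number
        for i in range(1, len(peptide))
        if peptide[i - 1] in BASIC and peptide[i] in BASIC
    ]


# Hypothetical peptide: alanine filler with arg-lys-arg placed at residues 97-99.
toy = "A" * 96 + "RKR" + "A" * 5
print(dibasic_cleavage_positions(toy))
```

In a tribasic arg-lys-arg site the two overlapping pairs (arg-lys and lys-arg) yield the two candidate cleavage points named in the text, after the lysine and after the second arginine.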

The present invention therefore provides
post-translationally modified polypeptides or polypeptide
fragments having the amino acid sequence from amino acid
residue 21 to amino acid residue 98 of SEQ ID NO:2 and the
amino acid sequence from amino acid residue 21 to amino
acid residue 99 of SEQ ID NO:2. Examples of
post-translational modifications include proteolytic cleavage,
glycosylation and disulfide bonding.

Analysis of the tissue distribution of the mRNA
corresponding to this novel DNA by Northern blot and Dot blot
analysis suggests that zpep10 is a testis-specific protein
having a transcript of about 1.5 kb.

The present invention further provides polynucleotide
molecules, including DNA and RNA molecules, encoding
zpep10 proteins. The polynucleotides of the present invention
include the sense strand; the anti-sense strand; and the
DNA as double-stranded, having both the sense and
anti-sense strand annealed together by their respective hydrogen
bonds. Representative DNA sequences encoding zpep10
proteins are set forth in SEQ ID NO:1. DNA sequences
encoding other zpep10 proteins can be readily generated by
those of ordinary skill in the art based on the genetic code.
Counterpart RNA sequences can be generated by substitution
of U for T.

Those skilled in the art will readily recognize that, in view
of the degeneracy of the genetic code, considerable sequence
variation is possible among these polynucleotide molecules.
SEQ ID NO:3 is a degenerate DNA sequence that encompasses
all DNAs that encode the zpep10 polypeptide of SEQ
ID NO:2. Those skilled in the art will recognize that the
degenerate sequence of SEQ ID NO:3 also provides all RNA
sequences encoding SEQ ID NO:2 by substituting U for T.
Thus, zpep10 polypeptide-encoding polynucleotides comprising
nucleotide 1 to nucleotide 426 of SEQ ID NO:3 and
their RNA equivalents are contemplated by the present
invention. Table 1 sets forth the one-letter codes used within
SEQ ID NO:3 to denote degenerate nucleotide positions.
"Resolutions" are the nucleotides denoted by a code letter.
"Complement" indicates the code for the complementary
nucleotide(s). For example, the code Y denotes either C or
T, and its complement R denotes A or G, A being complementary
to T, and G being complementary to C.

TABLE 1

Nucleotide   Resolution   Complement   Complement resolution
A            A            T            T
C            C            G            G
G            G            C            C
T            T            A            A
R            A|G          Y            C|T
Y            C|T          R            A|G
M            A|C          K            G|T
K            G|T          M            A|C
S            C|G          S            C|G
W            A|T          W            A|T
H            A|C|T        D            A|G|T
B            C|G|T        V            A|C|G
V            A|C|G        B            C|G|T
D            A|G|T        H            A|C|T
N            A|C|G|T      N            A|C|G|T
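The complement column of Table 1 follows mechanically from the resolutions: complement each resolved base and look up the code for the resulting set. A minimal sketch of that derivation, together with the U-for-T substitution that gives the RNA counterpart, is shown here; the function names are illustrative.

```python
# Code letter -> nucleotides it resolves to (the "Resolution" column of Table 1).
RESOLUTION = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "H": "ACT", "B": "CGT",
    "V": "ACG", "D": "AGT", "N": "ACGT",
}
BASE_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Reverse lookup: set of nucleotides -> code letter.
CODE_FOR = {frozenset(bases): code for code, bases in RESOLUTION.items()}


def complement_code(code):
    """Return the code letter denoting the complement of a degenerate code."""
    bases = RESOLUTION[code.upper()]
    return CODE_FOR[frozenset(BASE_COMPLEMENT[b] for b in bases)]


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(complement_code(c) for c in reversed(seq))


def to_rna(seq):
    """RNA counterpart of a DNA sequence: substitute U for T."""
    return seq.upper().replace("T", "U")


for code in RESOLUTION:
    print(code, "->", complement_code(code))
```

Deriving the complements from the resolutions, rather than typing them in, also serves as a check on the table itself.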



TABLE 2

Amino Acid   One-Letter Code   Codons                     Degenerate Codon
Cys          C                 TGC TGT                    TGY
Ser          S                 AGC AGT TCA TCC TCG TCT    WSN
Thr          T                 ACA ACC ACG ACT            ACN
Pro          P                 CCA CCC CCG CCT            CCN
Ala          A                 GCA GCC GCG GCT            GCN
Gly          G                 GGA GGC GGG GGT            GGN
Asn          N                 AAC AAT                    AAY
Asp          D                 GAC GAT                    GAY
Glu          E                 GAA GAG                    GAR
Gln          Q                 CAA CAG                    CAR
His          H                 CAC CAT                    CAY
Arg          R                 AGA AGG CGA CGC CGG CGT    MGN
Lys          K                 AAA AAG                    AAR
Met          M                 ATG                        ATG
Ile          I                 ATA ATC ATT                ATH
Leu          L                 CTA CTC CTG CTT TTA TTG    YTN
Val          V                 GTA GTC GTG GTT            GTN
Phe          F                 TTC TTT                    TTY
Tyr          Y                 TAC TAT                    TAY
Trp          W                 TGG                        TGG
Ter                            TAA TAG TGA                TRR
Asn|Asp      B                                            RAY
Glu|Gln      Z                                            SAR
Any          X                                            NNN



The degenerate codons used in SEQ ID NO:3, encompassing
all possible codons for a given amino acid, are set
forth in Table 2.



One of ordinary skill in the art will appreciate that some
ambiguity is introduced in determining a degenerate codon,
representative of all possible codons encoding each amino
acid. For example, the degenerate codon for serine (WSN)
can, in some circumstances, encode arginine (AGR), and the
degenerate codon for arginine (MGN) can, in some
circumstances, encode serine (AGY). A similar relationship
exists between codons encoding phenylalanine and leucine.
Thus, some polynucleotides encompassed by the degenerate
sequence may encode variant amino acid sequences, but one
of ordinary skill in the art can easily identify such variant
sequences by reference to the amino acid sequence of SEQ
ID NO:2. Variant sequences can be readily tested for
functionality as described herein.
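The ambiguity described above can be made concrete by expanding a degenerate codon into every concrete codon it covers and translating each one. The sketch below assumes the standard genetic code; the helper names are illustrative.

```python
from itertools import product

# One-letter nucleotide codes and the bases they resolve to.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "H": "ACT", "B": "CGT",
    "V": "ACG", "D": "AGT", "N": "ACGT",
}

# Standard genetic code in TCAG order; "*" marks a termination codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate):
    """List every concrete sequence covered by a degenerate sequence."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in degenerate.upper()))]


def encoded_residues(degenerate_codon):
    """Set of one-letter amino acids encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand(degenerate_codon)}


for codon in ("WSN", "MGN", "YTN", "TGY"):
    print(codon, sorted(encoded_residues(codon)))
```

Expanding WSN shows that it covers codons for residues other than serine, arginine among them, while an unambiguous degenerate codon such as TGY resolves to a single amino acid.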

One of ordinary skill in the art will also appreciate that
different species can exhibit "preferential codon usage." In
general, see, Grantham, et al., Nuc. Acids Res. 8:1893-912,
1980; Haas, et al. Curr. Biol. 6:315-24, 1996; Wain-Hobson,
et al., Gene 13:355-64, 1981; Grosjean and Fiers, Gene
18:199-209, 1982; Holm, Nuc. Acids Res. 14:3075-87,
1986; Ikemura, J. Mol. Biol. 158:573-97, 1982. As used
herein, the term "preferential codon usage" or "preferential
codons" is a term of art referring to protein translation
codons that are most frequently used in cells of a certain
species, thus favoring one or a few representatives of the
possible codons encoding each amino acid (See Table 2).
For example, the amino acid threonine (Thr) may be
encoded by ACA, ACC, ACG, or ACT, but in mammalian
cells ACC is the most commonly used codon; in other
species, for example, insect cells, yeast, viruses or bacteria,
different Thr codons may be preferential. Preferential
codons for a particular species can be introduced into the
polynucleotides of the present invention by a variety of
methods known in the art. Introduction of preferential codon
sequences into recombinant DNA can, for example, enhance
production of the protein by making protein translation more
efficient within a particular cell type or species. Therefore,
the degenerate codon sequence disclosed in SEQ ID NO:3
serves as a template for optimizing expression of
polynucleotides in various cell types and species commonly used in
the art and disclosed herein. Sequences containing preferential
codons can be tested and optimized for expression in
various species, and tested for functionality as disclosed
herein.

Within preferred embodiments of the invention the isolated
polynucleotides will hybridize to similar sized regions
of SEQ ID NO:1, other polynucleotide probes, primers,
fragments and sequences recited herein or sequences
complementary thereto. Polynucleotide hybridization is well
known in the art and widely used for many applications, see
for example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y.,
1989; Ausubel et al., eds., Current Protocols in Molecular
Biology, John Wiley and Sons, Inc., NY, 1987; Berger and
Kimmel, eds., Guide to Molecular Cloning Techniques,
Methods in Enzymology, volume 152, 1987 and Wetmur,
Crit. Rev. Biochem. Mol. Biol. 26:227-59, 1990. Polynucleotide
hybridization exploits the ability of single stranded
complementary sequences to form a double helix hybrid.
Such hybrids include DNA-DNA, RNA-RNA and DNA-RNA.

Hybridization will occur between sequences which contain
some degree of complementarity. Hybrids can tolerate
mismatched base pairs in the double helix, but the stability
of the hybrid is influenced by the degree of mismatch. The
Tm of the mismatched hybrid decreases by 1° C. for every
1-1.5% base pair mismatch. Varying the stringency of the
hybridization conditions allows control over the degree of
mismatch that will be present in the hybrid. The degree of
stringency increases as the hybridization temperature
increases and the ionic strength of the hybridization buffer
decreases. Stringent hybridization conditions encompass
temperatures of about 5-25° C. below the thermal melting
point (Tm) of the hybrid and a hybridization buffer having up
to 1 M Na+. Higher degrees of stringency at lower temperatures
can be achieved with the addition of formamide which
reduces the Tm of the hybrid about 1° C. for each 1%
formamide in the buffer solution. Generally, such stringent
conditions include temperatures of 20-70° C. and a
hybridization buffer containing up to 6x SSC and 0-50%
formamide. A higher degree of stringency can be achieved at
temperatures of from 40-70° C. with a hybridization buffer
having up to 4x SSC and from 0-50% formamide. Highly
stringent conditions typically encompass temperatures of
42-70° C. with a hybridization buffer having up to 1x SSC
and 0-50% formamide. Different degrees of stringency can
be used during hybridization and washing to achieve maximum
specific binding to the target sequence. Typically, the
washes following hybridization are performed at increasing
degrees of stringency to remove non-hybridized
polynucleotide probes from hybridized complexes.
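The two rules of thumb stated above (about 1° C. lost per 1-1.5% mismatch and about 1° C. lost per 1% formamide) can be combined into a rough estimate. This is a sketch only: the function name and the linear, additive treatment of the two effects are assumptions made for illustration, and the Tm of the perfectly matched hybrid must be supplied from elsewhere.

```python
def adjusted_tm(tm_perfect_c, mismatch_pct=0.0, formamide_pct=0.0,
                mismatch_pct_per_degree=1.0):
    """Estimate the Tm (deg C) of a mismatched hybrid in formamide buffer.

    tm_perfect_c            Tm of the perfectly matched hybrid without formamide
    mismatch_pct            percent of mismatched base pairs
    formamide_pct           percent formamide in the hybridization buffer
    mismatch_pct_per_degree percent mismatch that lowers Tm by 1 deg C (1.0-1.5)
    """
    if not 1.0 <= mismatch_pct_per_degree <= 1.5:
        raise ValueError("the text gives 1-1.5% mismatch per degree")
    if mismatch_pct < 0 or not 0 <= formamide_pct <= 100:
        raise ValueError("percentages out of range")
    mismatch_drop = mismatch_pct / mismatch_pct_per_degree
    formamide_drop = formamide_pct * 1.0  # about 1 deg C per 1% formamide
    return tm_perfect_c - mismatch_drop - formamide_drop


# Hypothetical hybrid with a perfect-match Tm of 80 deg C.
print(adjusted_tm(80.0, mismatch_pct=3.0, formamide_pct=50.0))
print(adjusted_tm(80.0, mismatch_pct=3.0, formamide_pct=50.0,
                  mismatch_pct_per_degree=1.5))
```

The estimate shows why formamide allows stringent hybridization at lower incubation temperatures: the melting point itself is shifted down, so the same margin below Tm is reached at a cooler temperature.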

The above conditions are meant to serve as a guide and it
is well within the abilities of one skilled in the art to adapt
these conditions for use with a particular polynucleotide hybrid.
The Tm for a specific target sequence is the temperature
(under defined conditions) at which 50% of the target
sequence will hybridize to a perfectly matched probe
sequence. Those conditions which influence the Tm include
the size and base pair content of the polynucleotide probe,
the ionic strength of the hybridization solution, and the
presence of destabilizing agents in the hybridization solution.
Numerous equations for calculating Tm are known in
the art, see for example (Sambrook et al., ibid.; Ausubel et
al., ibid.; Berger and Kimmel, ibid. and Wetmur, ibid.) and
are specific for DNA, RNA and DNA-RNA hybrids and
polynucleotide probe sequences of varying length. Sequence
analysis software such as Oligo 4.0 (publicly available
shareware) and Primer Premier (PREMIER Biosoft
International, Palo Alto, Calif.) as well as sites on the
Internet, are available tools for analyzing a given sequence
and calculating Tm based on user defined criteria. Such
programs can also analyze a given sequence under defined
conditions and suggest suitable probe sequences. Typically,
hybridization of longer polynucleotide sequences, >50 bp, is
done at temperatures of about 20-25° C. below the calculated
Tm. For smaller probes, <50 bp, hybridization is
typically carried out at the Tm or 5-10° C. below. This allows
for the maximum rate of hybridization for DNA-DNA and
DNA-RNA hybrids.
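The length-dependent guideline above can be written as a small lookup. This is a sketch: the function name is illustrative, the calculated Tm is an input, and because the text does not say how a probe of exactly 50 bp is handled, it is grouped with the shorter probes here as an assumption.

```python
def hybridization_window(tm_c, probe_length_bp):
    """Return (low, high) hybridization temperatures in deg C.

    Probes longer than 50 bp: about 20-25 deg C below the calculated Tm.
    Shorter probes: at the Tm or 5-10 deg C below it.
    Assumption: a probe of exactly 50 bp is treated as a short probe.
    """
    if probe_length_bp < 1:
        raise ValueError("probe length must be positive")
    if probe_length_bp > 50:
        return (tm_c - 25.0, tm_c - 20.0)
    return (tm_c - 10.0, tm_c)


# Hypothetical probes and calculated Tm values.
print(hybridization_window(70.0, 200))
print(hybridization_window(60.0, 20))
```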

The length of the polynucleotide sequence influences the
rate and stability of hybrid formation. Smaller probe
sequences, <50 bp, come to equilibrium with complementary
sequences rapidly, but may form less stable hybrids.
Incubation times of anywhere from minutes to hours can be
used to achieve hybrid formation. Longer probe sequences
come to equilibrium more slowly, but form more stable
complexes even at lower temperatures. Incubations are
allowed to proceed overnight or longer. Generally, incubations
are carried out for a period equal to three times the
calculated Cot time. Cot time, the time it takes for the
polynucleotide sequences to reassociate, can be calculated
for a particular sequence by methods known in the art.

The base pair composition of a polynucleotide sequence
will affect the thermal stability of the hybrid complex,
thereby influencing the choice of hybridization temperature
and the ionic strength of the hybridization buffer. A-T pairs
are less stable than G-C pairs in aqueous solutions containing
NaCl. Therefore, the higher the G-C content, the more
stable the hybrid. Even distribution of G and C residues
within the sequence also contributes positively to hybrid
stability. Base pair composition can be manipulated to alter
the Tm of a given sequence, for example,
5-methyldeoxycytidine can be substituted for deoxycytidine
and 5-bromodeoxyuridine can be substituted for thymidine
to increase the Tm. 7-deaza-2'-deoxyguanosine can be
substituted for guanosine to reduce dependence on Tm.

Ionic concentration of the hybridization buffer also affects
the stability of the hybrid. Hybridization buffers generally
contain blocking agents such as Denhardt's solution (Sigma
Chemical Co., St. Louis, Mo.), denatured salmon sperm
DNA, tRNA, milk powders (BLOTTO), heparin or SDS,
and a Na+ source, such as SSC (1x SSC: 0.15 M NaCl, 15
mM sodium citrate) or SSPE (1x SSPE: 1.8 M NaCl, 10 mM
NaH2PO4, 1 mM EDTA, pH 7.7). By decreasing the ionic
concentration of the buffer, the stability of the hybrid is
increased. Typically, hybridization buffers contain
between 10 mM and 1 M Na+. Premixed hybridization solutions
are also available from commercial sources such as Clontech
Laboratories (Palo Alto, Calif.) and Promega Corporation
(Madison, Wis.) for use according to manufacturer's
instruction. Addition of destabilizing or denaturing agents such as
formamide, tetraalkylammonium salts, guanidinium cations
or thiocyanate cations to the hybridization solution will alter
the Tm of a hybrid. Typically, formamide is used at a
concentration of up to 50% to allow incubations to be carried
out at more convenient and lower temperatures. Formamide
also acts to reduce non-specific background when using
RNA probes.

As previously noted, the isolated zpep10 polynucleotides
of the present invention include DNA and RNA. Methods
for isolating DNA and RNA are well known in the art. It is
generally preferred to isolate RNA from lymph node,
although DNA can also be prepared using RNA from other
tissues or isolated as genomic DNA. Total RNA can be
prepared using guanidine HCl extraction followed by isolation
by centrifugation in a CsCl gradient (Chirgwin et al.,
Biochemistry 18:52-94, 1979). Poly(A)+ RNA is prepared
from total RNA using the method of Aviv and Leder (Proc.
Natl. Acad. Sci. USA 69:1408-12, 1972). Complementary
DNA (cDNA) is prepared from poly(A)+ RNA using known
methods. Polynucleotides encoding zpep10 polypeptides are
then identified and isolated by, for example, hybridization or
PCR.

The polynucleotides of the present invention can also be
synthesized using automated equipment. The current
method of choice is the phosphoramidite method. If chemically
synthesized double stranded DNA is required for an
application such as the synthesis of a gene or a gene
fragment, then each complementary strand is made separately.
The production of short genes (60 to 80 bp) is
technically straightforward and can be accomplished by
synthesizing the complementary strands and then annealing
them. For the production of longer genes (>300 bp),
however, special strategies must be invoked, because the
coupling efficiency of each cycle during chemical DNA
synthesis is seldom 100%. To overcome this problem, synthetic
genes (double-stranded) are assembled in modular
form from single-stranded fragments that are from 20 to 100
nucleotides in length. Gene synthesis methods are well
known in the art. See, for example, Glick and Pasternak,
Molecular Biotechnology, Principles & Applications of
Recombinant DNA, ASM Press, Washington, D.C., 1994;
Itakura et al., Annu. Rev. Biochem. 53:323-356, 1984; and
Climie et al., Proc. Natl. Acad. Sci. USA 87:633-637, 1990.
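The reason long genes are assembled from short fragments can be seen from a simple yield estimate. The sketch below assumes each coupling cycle succeeds independently with the same efficiency, so that an oligonucleotide of n nucleotides needs n-1 successful couplings; the 99% figure is a hypothetical value chosen for illustration and is not taken from the text.

```python
def full_length_fraction(length_nt, coupling_efficiency):
    """Fraction of chains reaching full length after stepwise synthesis.

    Assumes length_nt - 1 independent coupling cycles, each succeeding
    with probability coupling_efficiency.
    """
    if length_nt < 1:
        raise ValueError("length must be at least 1 nucleotide")
    if not 0.0 < coupling_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return coupling_efficiency ** (length_nt - 1)


EFFICIENCY = 0.99  # hypothetical per-cycle coupling efficiency
for length in (20, 80, 100, 300):
    fraction = full_length_fraction(length, EFFICIENCY)
    print(f"{length:>3} nt: {fraction:.1%} full-length product")
```

Because the yield falls geometrically with length, fragments of 20 to 100 nucleotides remain practical to synthesize while a single strand of more than 300 nucleotides does not.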
The zpep10 polynucleotide sequences disclosed herein
can be used to isolate polynucleotides encoding other
zpep10 proteins. Such other proteins include alternatively
spliced cDNAs (including cDNAs encoding secreted zpep10
proteins) and counterpart polynucleotides from other species
(orthologs). These orthologous polynucleotides can be used,
inter alia, to prepare the respective orthologous proteins.
Other species of interest include, but are not limited to,
mammalian, avian, amphibian, reptile, fish, insect and other
vertebrate and invertebrate species. Of particular interest are
zpep10 polynucleotides and proteins from other mammalian
species, including human and other primates, porcine, ovine,
bovine, canine, feline, and equine polynucleotides and
proteins. Orthologs of mouse zpep10, for example, can be
cloned using information and compositions provided by the
present invention in combination with conventional cloning
techniques. For example, a cDNA can be cloned using
mRNA obtained from a tissue or cell type that expresses
zpep10 as disclosed herein. Suitable sources of mRNA can
be identified by probing Northern blots with probes designed
from the sequences disclosed herein. A library is then
prepared from mRNA of a positive tissue or cell line. A
zpep10-encoding cDNA can then be isolated by a variety of
methods, such as by probing with a complete or partial
human cDNA or with one or more sets of degenerate probes
based on the disclosed sequences. A cDNA can also be
cloned using the polymerase chain reaction, or PCR (Mullis,
U.S. Pat. No. 4,683,202), using primers designed from the
representative human zpep10 sequence disclosed herein.
Within an additional method, the cDNA library can be used
to transform or transfect host cells, and expression of the
cDNA of interest can be detected with an antibody to zpep10
polypeptide. Similar techniques can also be applied to the
isolation of genomic clones. Electronic databases can also
be screened for EST sequences of zpep10 orthologs.
Degenerate polynucleotide primer sequences useful for identifying
zpep10 orthologs would include:



zpep10 residues 15-20 of SEQ ID NO:2
CARGCNTGYGTNTTYTG (SEQ ID NO:4)

zpep10 residues 42-47 of SEQ ID NO:2
CARAARGARTGYGGNGC (SEQ ID NO:5)

zpep10 residues 61-66 of SEQ ID NO:2
ATGAAYAARGRNACNGA (SEQ ID NO:6)

zpep10 residues 64-69 of SEQ ID NO:2
GRNACNGARAARACNCA (SEQ ID NO:7)

zpep10 residues 86-91 of SEQ ID NO:2
ACNTGYAARGGNACNGA (SEQ ID NO:8).

Those skilled in the art will recognize that the sequences
disclosed in SEQ ID NO:1 and SEQ ID NO:2 represent a
single allele of the human zpep10 gene and polypeptide, and
that allelic variation and alternative splicing are expected to
occur. In addition, allelic variants can be cloned by probing
cDNA or genomic libraries from different individuals
according to standard procedures. Allelic variants of the
DNA sequence shown in SEQ ID NO:1, including those
containing silent mutations and those in which mutations
result in amino acid sequence changes, are within the scope
of the present invention, as are proteins which are allelic
variants of SEQ ID NO:2. cDNAs generated from alternatively
spliced mRNAs, which retain the properties of the
zpep10 polypeptide are included within the scope of the
present invention, as are polypeptides encoded by such
cDNAs and mRNAs. Allelic variants and splice variants of
these sequences can be cloned by probing cDNA or
genomic libraries from different individuals or tissues
according to standard procedures known in the art.

The present invention also provides isolated zpep10
polypeptides that are substantially homologous to the
polypeptide of SEQ ID NO:2 and its species orthologs. The
term "substantially homologous" is used herein to denote
polypeptides having 60%, preferably at least 80%, sequence
identity to the sequences shown in SEQ ID NO:2 or their
orthologs. Such polypeptides will more preferably be at least
90% identical, and most preferably 95% or more identical to
SEQ ID NO:2 or its orthologs. Percent sequence identity is
determined by conventional methods. See, for example,
Altschul et al., Bull. Math. Bio. 48:603-16, 1986 and
Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915-9, 1992. Briefly, two amino acid sequences are
aligned to optimize the alignment scores using a gap opening
penalty of 10, a gap extension penalty of 1, and the
"blosum 62" scoring matrix of Henikoff and Henikoff (ibid.)
as shown in Table 3 (amino acids are indicated by the
standard one-letter codes). The percent identity is then
calculated as:



              Total number of identical matches
---------------------------------------------------------  x 100
[length of the longer sequence plus the number of gaps
 introduced into the longer sequence in order to align
 the two sequences]
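The ratio above can be applied directly to a pair of already-aligned sequences. The following is a minimal sketch, assuming the alignment is supplied as two equal-length strings with "-" marking gaps; the function name and the example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences, per the ratio above.

    Numerator:   positions where both sequences carry the same residue.
    Denominator: length of the longer (ungapped) sequence plus the number
                 of gaps introduced into that longer sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    # If both sequences have the same ungapped length, the first is used.
    longer = aligned_a if len_a >= len_b else aligned_b
    denominator = max(len_a, len_b) + longer.count(gap)
    if denominator == 0:
        raise ValueError("sequences are empty")
    return 100.0 * matches / denominator


# Hypothetical aligned peptides.
print(percent_identity("ACDEFG", "ACD-FG"))
print(percent_identity("AC-DE", "ACGDE"))
```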



TABLE 3

    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
A   4
R  -1   5
N  -2   0   6
D  -2  -2   1   6
C   0  -3  -3  -3   9
Q  -1   1   0   0  -3   5
E  -1   0   0   2  -4   2   5
G   0  -2   0  -1  -3  -2  -2   6
H  -2   0   1  -1  -3   0   0  -2   8
I  -1  -3  -3  -3  -1  -3  -3  -4  -3   4
L  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4
K  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5
M  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5
F  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6
P  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7
S   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4
T   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5
W  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11
Y  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7
V   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4



Sequence identity of polynucleotide molecules is deter- 
mined by similar methods using a ratio as disclosed above. 

Those skilled in the art appreciate that there are many
established algorithms available to align two amino acid
sequences. The "FASTA" similarity search algorithm of
Pearson and Lipman is a suitable protein alignment method
for examining the level of identity shared by an amino acid
sequence disclosed herein and the amino acid sequence of a
putative variant zpep10. The FASTA algorithm is described
by Pearson and Lipman, Proc. Nat'l Acad. Sci. USA 85:2444,
1988, and by Pearson, Meth. Enzymol. 183:63, 1990.

Briefly, FASTA first characterizes sequence similarity by
identifying regions shared by the query sequence (e.g., SEQ
ID NO:2) and a test sequence that have either the highest
density of identities (if the ktup variable is 1) or pairs of
identities (if ktup=2), without considering conservative
amino acid substitutions, insertions, or deletions. The ten
regions with the highest density of identities are then
re-scored by comparing the similarity of all paired amino
acids using an amino acid substitution matrix, and the ends
of the regions are "trimmed" to include only those residues
that contribute to the highest score. If there are several
regions with scores greater than the "cutoff" value
(calculated by a predetermined formula based upon the
length of the sequence and the ktup value), then the trimmed
initial regions are examined to determine whether the
regions can be joined to form an approximate alignment
with gaps. Finally, the highest scoring regions of the two
amino acid sequences are aligned using a modification of the
Needleman-Wunsch-Sellers algorithm (Needleman and
Wunsch, J. Mol. Biol. 48:444, 1970; Sellers, SIAM J. Appl.
Math. 26:787, 1974), which allows for amino acid insertions
and deletions. Preferred parameters for FASTA analysis are:
ktup=1, gap opening penalty=10, gap extension penalty=1,
and substitution matrix=BLOSUM62. These parameters can
be introduced into a FASTA program by modifying the
scoring matrix file ("SMATRIX"), as explained in Appendix
2 of Pearson, Meth. Enzymol. 183:63, 1990.
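The first stage described above, locating regions that share identities (ktup=1) or pairs of identities (ktup=2), can be pictured as counting k-tuple matches along alignment diagonals. The sketch below is a simplified illustration of that one idea only; it is not the FASTA program, it omits re-scoring, trimming, joining and the final gapped alignment, and the function name and example sequences are hypothetical.

```python
from collections import Counter, defaultdict


def diagonal_hits(query, subject, ktup=2):
    """Count shared k-tuples on each diagonal (query offset - subject offset).

    A diagonal with many hits marks a region where the two sequences
    share a high density of identities without gaps.
    """
    if ktup < 1:
        raise ValueError("ktup must be at least 1")
    # Index every k-tuple of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)
    # Scan the subject and record the diagonal of every shared k-tuple.
    hits = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            hits[i - j] += 1
    return hits


# Hypothetical sequences: the subject carries the query shifted by two residues.
hits = diagonal_hits("ACDEFG", "XXACDEFG", ktup=2)
print(hits.most_common(1))
```

With ktup=2 fewer chance matches are recorded than with ktup=1, which is why the smaller value is the more sensitive (and slower) setting.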

FASTA can also be used to determine the sequence
identity of nucleic acid molecules using a ratio as disclosed
above. For nucleotide sequence comparisons, the ktup value
can range from one to six, preferably from three to six,
most preferably three, with other parameters set as default.

The BLOSUM62 table is an amino acid substitution
matrix derived from about 2,000 local multiple alignments
of protein sequence segments, representing highly conserved
regions of more than 500 groups of related proteins
(Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915, 1992). Accordingly, the BLOSUM62 substitution
frequencies can be used to define conservative amino acid
substitutions that may be introduced into the amino acid
sequences of the present invention. Although it is possible to
design amino acid substitutions based solely upon chemical
properties (as discussed above), the language "conservative
amino acid substitution" preferably refers to a substitution
represented by a BLOSUM62 value of greater than -1. For
example, an amino acid substitution is conservative if the
substitution is characterized by a BLOSUM62 value of 0, 1,
2, or 3. According to this system, preferred conservative
amino acid substitutions are characterized by a BLOSUM62
value of at least 1 (e.g., 1, 2 or 3), while more preferred
conservative amino acid substitutions are characterized by a
BLOSUM62 value of at least 2 (e.g., 2 or 3).
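The tiers defined above translate into a simple threshold test on the BLOSUM62 value of a substitution. This sketch uses a handful of entries from the BLOSUM62 matrix for illustration; the function name and the tier labels are illustrative wording, not terms defined by the text.

```python
def classify_substitution(blosum62_value):
    """Classify a substitution by its BLOSUM62 value, per the tiers above."""
    if blosum62_value >= 2:
        return "more preferred conservative"
    if blosum62_value >= 1:
        return "preferred conservative"
    if blosum62_value > -1:
        return "conservative"
    return "not conservative"


# A few BLOSUM62 entries (residue pair -> value) used as examples.
EXAMPLE_SCORES = {
    ("I", "V"): 3,
    ("R", "K"): 2,
    ("A", "S"): 1,
    ("A", "G"): 0,
    ("L", "D"): -4,
}

for (a, b), value in EXAMPLE_SCORES.items():
    print(f"{a} -> {b}: BLOSUM62 {value:+d}, {classify_substitution(value)}")
```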

Substantially homologous proteins and polypeptides are
characterized as having one or more amino acid
substitutions, deletions or additions. These changes are
preferably of a minor nature, that is conservative amino acid
substitutions (see Table 4) and other substitutions that do not
significantly affect the folding or activity of the protein or
polypeptide; small deletions, typically of one to about 30
amino acids; and small amino- or carboxyl-terminal
extensions, such as an amino-terminal methionine residue, a
small linker peptide of up to about 20-25 residues or an
affinity tag. Polypeptides comprising affinity tags can further
comprise a proteolytic cleavage site between the zpep10
polypeptide and the affinity tag. Preferred such sites include
thrombin cleavage sites and factor Xa cleavage sites.

TABLE 4

Conservative amino acid substitutions

Basic:         arginine
               lysine
               histidine
Acidic:        glutamic acid
               aspartic acid
Polar:         glutamine
               asparagine
Hydrophobic:   leucine
               isoleucine
               valine
Aromatic:      phenylalanine
               tryptophan
               tyrosine
Small:         glycine
               alanine
               serine
               threonine
               methionine
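
The groupings in Table 4 lend themselves to a simple lookup. The
following is a minimal sketch, assuming that two residues count as a
conservative pair when they appear in the same row of the table; the
dictionary and function names are hypothetical.

```python
# Groups transcribed from Table 4.
TABLE_4_GROUPS = {
    "basic": {"arginine", "lysine", "histidine"},
    "acidic": {"glutamic acid", "aspartic acid"},
    "polar": {"glutamine", "asparagine"},
    "hydrophobic": {"leucine", "isoleucine", "valine"},
    "aromatic": {"phenylalanine", "tryptophan", "tyrosine"},
    "small": {"glycine", "alanine", "serine", "threonine", "methionine"},
}


def shared_group(residue_a, residue_b):
    """Return the Table 4 group containing both residues, or None."""
    a, b = residue_a.lower(), residue_b.lower()
    for name, members in TABLE_4_GROUPS.items():
        if a in members and b in members:
            return name
    return None


if __name__ == "__main__":
    print(shared_group("leucine", "valine"))
    print(shared_group("lysine", "glutamic acid"))
```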



The proteins of the present invention can also comprise
non-naturally occurring amino acid residues. Non-naturally
occurring amino acids include, without limitation,
trans-3-methylproline, 2,4-methanoproline, cis-4-hydroxyproline,
trans-4-hydroxyproline, N-methyl-glycine, allo-threonine,
methylthreonine, hydroxyethyl-cysteine,
hydroxyethylhomocysteine, nitroglutamine, homoglutamine,
pipecolic acid, thiazolidine carboxylic acid,
dehydroproline, 3- and 4-methylproline, 3,3-dimethylproline,
tert-leucine, norvaline, 2-azaphenylalanine,
3-azaphenylalanine, 4-azaphenylalanine, and
4-fluorophenylalanine. Several methods are known in the
art for incorporating non-naturally occurring amino acid
residues into proteins. For example, an in vitro system can
be employed wherein nonsense mutations are suppressed
using chemically aminoacylated suppressor tRNAs. Methods
for synthesizing amino acids and aminoacylating tRNA
are known in the art. Transcription and translation of
plasmids containing nonsense mutations is carried out in a
cell-free system comprising an E. coli S30 extract and
commercially available enzymes and other reagents.
Proteins are purified by chromatography. See, for example,
Robertson et al., J. Am. Chem. Soc. 113:2722, 1991; Ellman
et al., Methods Enzymol. 202:301, 1991; Chung et al.,
Science 259:806-9, 1993; and Chung et al., Proc. Natl.
Acad. Sci. USA 90:10145-9, 1993. In a second method,
translation is carried out in Xenopus oocytes by
microinjection of mutated mRNA and chemically aminoacylated
suppressor tRNAs (Turcatti et al., J. Biol. Chem.
271:19991-8, 1996). Within a third method, E. coli cells are
cultured in the absence of a natural amino acid that is to be
replaced (e.g., phenylalanine) and in the presence of the
desired non-naturally occurring amino acid(s) (e.g.,
2-azaphenylalanine, 3-azaphenylalanine,
4-azaphenylalanine, or 4-fluorophenylalanine). The
non-naturally occurring amino acid is incorporated into the
protein in place of its natural counterpart. See, Koide et al.,
Biochem. 33:7470-6, 1994. Naturally occurring amino acid
residues can be converted to non-naturally occurring species
by in vitro chemical modification. Chemical modification
can be combined with site-directed mutagenesis to further
expand the range of substitutions (Wynn and Richards,
Protein Sci. 2:395-403, 1993).

A limited number of non-conservative amino acids, amino
acids that are not encoded by the genetic code, non-naturally
occurring amino acids, and unnatural amino acids may be
substituted for zpep10 amino acid residues.

Essential amino acids in the zpep10 polypeptides of the
present invention can be identified according to procedures
known in the art, such as site-directed mutagenesis or
alanine-scanning mutagenesis (Cunningham and Wells,
Science 244:1081-5, 1989). In the latter technique, single
alanine mutations are introduced at every residue in the
molecule, and the resultant mutant molecules are tested for
biological activity (e.g., adhesion-modulation,
differentiation-modulation or the like) to identify amino acid
residues that are critical to the activity of the molecule. See
also, Hilton et al., J. Biol. Chem. 271:4699-708, 1996. Sites
of ligand-receptor or other biological interaction can also be
determined by physical analysis of structure, as determined
by such techniques as nuclear magnetic resonance,
crystallography, electron diffraction or photoaffinity
labeling, in conjunction with mutation of putative contact
site amino acids. See, for example, de Vos et al., Science
255:306-12, 1992; Smith et al., J. Mol. Biol. 224:899-904,
1992; Wlodaver et al., FEBS Lett. 309:59-64, 1992. The
identities of essential amino acids can also be inferred from
analysis of homologies with related proteins. Amino acid
residues that might be considered essential in the zpep10
polypeptide are cysteine residues at amino acid residues 17,
20, 35, 45, 84, 87, 94 and 100 of SEQ ID NO:2; the potential
arg-lys-arg tri-basic amino acid cleavage site at amino acid
residues 97-99 of SEQ ID NO:2; the potential N-linked
glycosylation sites at amino acid residues 83 and 86 of SEQ
ID NO:2 and the potential O-glycosylation sites at amino
acid residues 28, 36, 48, 52, 60, 65, 68, 78, 79, 80, 85, 86,
90, 93 and 104 of SEQ ID NO:2. A hydrophobicity profile
of SEQ ID NO:2 is shown in the attached FIGURE. Those
skilled in the art will recognize that this hydrophobicity will
be taken into account when designing alterations in the
amino acid sequence of a zpep10 polypeptide, so as not to
disrupt the overall profile.

Multiple amino acid substitutions can be made and tested
using known methods of mutagenesis and screening, such as
those disclosed by Reidhaar-Olson and Sauer (Science
241:53-57, 1988) or Bowie and Sauer (Proc. Natl. Acad.
Sci. USA 86:2152-2156, 1989). Briefly, these authors
disclose methods for simultaneously randomizing two or more
positions in a polypeptide, selecting for functional
polypeptide, and then sequencing the mutagenized
polypeptides to determine the spectrum of allowable
substitutions at each position. Other methods that can be used
include phage display (e.g., Lowman et al., Biochem.
30:10832-10837, 1991; Ladner et al., U.S. Pat. No. 5,223,409;
Huse, WIPO Publication WO 92/06204) and region-directed
mutagenesis (Derbyshire et al., Gene 46:145, 1986; Ner et
al., DNA 7:127, 1988).

Variants of the disclosed zpep10 DNA and polypeptide
sequences can be generated through DNA shuffling as
disclosed by Stemmer, Nature 370:389-91, 1994, Stemmer,
Proc. Natl. Acad. Sci. USA 91:10747-51, 1994 and WIPO
Publication WO 97/20078. Briefly, variant DNAs are
generated by in vitro homologous recombination by random
fragmentation of a parent DNA followed by reassembly
using PCR, resulting in randomly introduced point
mutations. This technique can be modified by using a family of
parent DNAs, such as allelic variants or DNAs from
different species, to introduce additional variability into the
process. Selection or screening for the desired activity, followed
by additional iterations of mutagenesis and assay provides
for rapid "evolution" of sequences by selecting for desirable
mutations while simultaneously selecting against
detrimental changes.

Mutagenesis methods as disclosed above can be combined
with high-throughput, automated screening methods
to detect activity of cloned, mutagenized polypeptides in
host cells. Mutagenized DNA molecules that encode active
polypeptides (e.g., ligand binding receptors) can be
recovered from the host cells and rapidly sequenced using modern
equipment. These methods allow the rapid determination of
the importance of individual amino acid residues in a
polypeptide of interest, and can be applied to polypeptides
of unknown structure.

Using the methods discussed above, one of ordinary skill
in the art can identify and/or prepare a variety of
polypeptides that are substantially homologous to, for example,
residues 21 to 111, 21 to 142 or 1 to 142 of SEQ ID NO:2
or allelic variants thereof and retain the properties of
wild-type protein. Such polypeptides may include additional
amino acids, such as affinity tags and the like. Such
polypeptides may also include additional polypeptide segments as
generally disclosed herein.

The invention also provides soluble polypeptides. It is
preferred that these soluble polypeptides be extracellular
polypeptides and be in a form substantially free of
transmembrane and intracellular polypeptide segments. To direct
the export of the soluble polypeptides from the host cell, the
DNA encoding the soluble polypeptide is linked to a second
DNA segment encoding a secretory peptide, such as a t-PA
secretory peptide or the native zpep10 secretory signal
sequence (amino acid residues 1-20 of SEQ ID NO:2). To
facilitate purification of the secreted polypeptide, an N- or
C-terminal extension, such as an affinity tag or another
polypeptide or protein for which an antibody or other
specific binding agent is available, can be fused to the
soluble polypeptide.

The present invention also provides zpep10 fusion
proteins. For example, fusion proteins of the present invention
encompass

(1) a polypeptide selected from the following: a) a
polypeptide comprising a sequence of amino acid
residues from amino acid residue 21 to amino acid residue
111 of SEQ ID NO:2; and b) a polypeptide comprising
a sequence of amino acid residues from amino acid
residue 1 to amino acid residue 20 of SEQ ID NO:2;
and

(2) another polypeptide. The other polypeptide may be a
signal peptide to facilitate secretion of the fusion
protein, a transmembrane and/or cytoplasmic domain,
or another soluble polypeptide or the like. For example,
the extracellular portion of a zpep10 polypeptide can be
prepared as a fusion to a dimerizing protein as
disclosed in U.S. Pat. Nos. 5,155,027 and 5,567,584.
Preferred dimerizing proteins in this regard include
immunoglobulin constant region domains.
Immunoglobulin-zpep10 polypeptide fusions can be
expressed in genetically engineered cells to produce a
variety of multimeric zpep10 analogs. Auxiliary
domains can be fused to zpep10 polypeptides to target
them to specific cells, tissues, or macromolecules. For
example, a soluble zpep10 polypeptide or protein could
be targeted to a predetermined cell type by fusing a
zpep10 polypeptide to a ligand that specifically binds to
a receptor on the surface of the target cell. In this way,
polypeptides and proteins can be targeted for
therapeutic or diagnostic purposes. A zpep10 polypeptide can be
fused to two or more moieties, such as an affinity tag for
purification and a targeting domain. Polypeptide
fusions can also comprise one or more cleavage sites,
particularly between domains. See, Tuan et al.,
Connective Tissue Research 34:1-9, 1996.
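
The residue ranges named in items (1) and (2) above use 1-based,
inclusive numbering, which is easy to get off by one when working
with sequence strings. The sketch below is illustrative only: SEQ ID
NO:2 is not reproduced in this section, so a placeholder string of
142 residues stands in for it, and the function names are
hypothetical.

```python
# Placeholder standing in for the 142-residue SEQ ID NO:2, which is
# not reproduced in this section. "X" marks an unspecified residue.
PLACEHOLDER_SEQ = "M" + "X" * 141


def region(seq, start, end):
    """Return residues start..end of seq using 1-based, inclusive numbering."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range falls outside the sequence")
    return seq[start - 1:end]


def make_fusion(zpep10_part, other_polypeptide, other_first=False):
    """Concatenate a zpep10 segment with another polypeptide sequence."""
    if other_first:
        return other_polypeptide + zpep10_part
    return zpep10_part + other_polypeptide


if __name__ == "__main__":
    signal = region(PLACEHOLDER_SEQ, 1, 20)     # item (1)(b)
    segment = region(PLACEHOLDER_SEQ, 21, 111)  # item (1)(a)
    print(len(signal), len(segment))
```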
The soluble zpep10 polypeptide is useful in studying the
distribution of zpep10 receptors on tissues or specific cell
lineages, and to provide insight into receptor/ligand biology.
Using labeled soluble zpep10, cells expressing the ligand are
identified by fluorescence immunocytometry or
immunohistochemistry. The effects of zpep10 on steroidogenesis or
Leydig or Sertoli cell expression can be examined by
probing tissue slices with soluble zpep10 fusions (see, for
example, Daehlin et al., Scand. J. Urol. Nephrol. 19:7-12,
1985; Gavino et al., Arch. Biochem. Biophys. 233:741-7,
1984 and von Schnakenburg et al., Acta Endocrinol.
94:397-403, 1980). Luteinizing hormone (LH) and follicle
stimulating hormone (FSH) responses could also be
examined in soluble zpep10-treated tissue slices.

The polypeptides of the present invention, including
full-length proteins, fragments thereof and fusion proteins,
can be produced in genetically engineered host cells
according to conventional techniques. Suitable host cells are those
cell types that can be transformed or transfected with
exogenous DNA and grown in culture, and include bacteria,
fungal cells, and cultured higher eukaryotic cells. Eukaryotic
cells, particularly cultured cells of multicellular organisms,
are preferred. Techniques for manipulating cloned DNA
molecules and introducing exogenous DNA into a variety of
host cells are disclosed by Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd ed., Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989, and
Ausubel et al. (eds.), Current Protocols in Molecular
Biology, John Wiley and Sons, Inc., NY, 1987.

In general, a DNA sequence encoding a zpep10
polypeptide of the present invention is operably linked to other
genetic elements required for its expression, generally
including a transcription promoter and terminator within an
expression vector. The vector will also commonly contain
one or more selectable markers and one or more origins of
replication, although those skilled in the art will recognize
that within certain systems selectable markers may be
provided on separate vectors, and replication of the exogenous
DNA may be provided by integration into the host cell
genome. Selection of promoters, terminators, selectable
markers, vectors and other elements is a matter of routine
design within the level of ordinary skill in the art. Many such
elements are described in the literature and are available
through commercial suppliers.

To direct a zpep10 polypeptide into the secretory pathway
of a host cell, a secretory signal sequence (also known as a
signal sequence, leader sequence, prepro sequence or pre
sequence) is provided in the expression vector. The secretory
signal sequence may be that of the zpep10 polypeptide, or
may be derived from another secreted protein (e.g., t-PA) or
synthesized de novo. The secretory signal sequence is joined
to the zpep10 DNA sequence in the correct reading frame
and positioned to direct the newly synthesized polypeptide into
the secretory pathway of the host cell. Secretory signal sequences
are commonly positioned 5' to the DNA sequence encoding
the polypeptide of interest, although certain secretory signal
sequences may be positioned elsewhere in the DNA
sequence of interest (see, e.g., Welch et al., U.S. Pat. No.
5,037,743; Holland et al., U.S. Pat. No. 5,143,830).
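
The requirement that the signal sequence be joined in the correct
reading frame can be checked mechanically. The sketch below assumes
the simplest case, a signal sequence placed 5' to the coding
sequence with both given as whole codons; the function names and the
short example sequences are hypothetical and do not come from the
specification.

```python
def codons(dna):
    """Split a DNA string into consecutive three-base codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def join_in_frame(signal_dna, coding_dna):
    """Place signal_dna 5' to coding_dna, refusing joins that shift the frame."""
    signal_dna, coding_dna = signal_dna.upper(), coding_dna.upper()
    if len(signal_dna) % 3 != 0:
        raise ValueError("signal sequence would shift the reading frame")
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    return signal_dna + coding_dna


if __name__ == "__main__":
    # Hypothetical toy sequences used only to show the frame check.
    construct = join_in_frame("ATGAAA", "GCTTAA")
    print(codons(construct))
```

A signal sequence whose length is not a multiple of three would shift
every downstream codon, which is why the join is rejected rather than
silently padded.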

Alternatively, the secretory signal sequence contained in
the polypeptides of the present invention is used to direct
other polypeptides into the secretory pathway. The present
invention provides for such fusion polypeptides. A signal
fusion polypeptide can be made wherein a secretory signal
sequence derived from amino acid residues 1-20 of SEQ ID
NO:2 is operably linked to another polypeptide using
methods known in the art and disclosed herein. The
secretory signal sequence contained in the fusion polypeptides of
the present invention is preferably fused amino-terminally to
an additional peptide to direct the additional peptide into the
secretory pathway. Such constructs have numerous
applications known in the art. For example, these novel secretory
signal sequence fusion constructs can direct the secretion of
an active component of a normally non-secreted protein,
such as a receptor. Such fusions may be used in vivo or in
vitro to direct peptides through the secretory pathway.
Cultured mammalian cells are suitable hosts within the
present invention. Methods for introducing exogenous DNA
into mammalian host cells include calcium
phosphate-mediated transfection (Wigler et al., Cell 14:725, 1978;
Corsaro and Pearson, Somatic Cell Genetics 7:603, 1981;
Graham and Van der Eb, Virology 52:456, 1973),
electroporation (Neumann et al., EMBO J. 1:841-845, 1982),
DEAE-dextran mediated transfection (Ausubel et al., eds., Current
Protocols in Molecular Biology, John Wiley and Sons, Inc.,
NY, 1987), liposome-mediated transfection (Hawley-Nelson
et al., Focus 15:73, 1993; Ciccarone et al., Focus 15:80,
1993), and viral vectors (Miller and Rosman, BioTechniques
7:980-90, 1989; Wang and Finer, Nature Med. 2:714-16,
1996). The production of recombinant polypeptides in
cultured mammalian cells is disclosed, for example, by
Levinson et al., U.S. Pat. No. 4,713,339; Hagen et al., U.S.
Pat. No. 4,784,950; Palmiter et al., U.S. Pat. No. 4,579,821;
and Ringold, U.S. Pat. No. 4,656,134. Suitable cultured
mammalian cells include the COS-1 (ATCC No. CRL 1650),
COS-7 (ATCC No. CRL 1651), BHK 570 (ATCC No. CRL
10314), 293 (ATCC No. CRL 1573; Graham et al., J. Gen.
Virol. 36:59-72, 1977) and Chinese hamster ovary (e.g.
CHO-K1; ATCC No. CCL 61) cell lines. Additional suitable
cell lines are known in the art and available from public
depositories such as the American Type Culture Collection,
Rockville, Md. In general, strong transcription promoters
are preferred, such as promoters from SV-40 or
cytomegalovirus. See, e.g., U.S. Pat. No. 4,956,288. Other suitable
promoters include those from metallothionein genes (U.S.
Pat. Nos. 4,579,821 and 4,601,978) and the adenovirus
major late promoter.

Drug selection is generally used to select for cultured
mammalian cells into which foreign DNA has been inserted.
Such cells are commonly referred to as "transfectants". Cells
that have been cultured in the presence of the selective agent
and are able to pass the gene of interest to their progeny are
referred to as "stable transfectants." A preferred selectable
marker is a gene encoding resistance to the antibiotic
neomycin. Selection is carried out in the presence of a
neomycin-type drug, such as G-418 or the like. Selection
systems may also be used to increase the expression level of
the gene of interest, a process referred to as "amplification."
Amplification is carried out by culturing transfectants in the
presence of a low level of the selective agent and then
increasing the amount of selective agent to select for cells
that produce high levels of the products of the introduced
genes. A preferred amplifiable selectable marker is
dihydrofolate reductase, which confers resistance to methotrexate.
Other drug resistance genes (e.g., hygromycin resistance,
multi-drug resistance, puromycin acetyltransferase) can also
be used. Alternative markers that introduce an altered
phenotype, such as green fluorescent protein, or cell surface
proteins such as CD4, CD8, Class I MHC, placental alkaline
phosphatase may be used to sort transfected cells from
untransfected cells by such means as FACS sorting or
magnetic bead separation technology.

Other higher eukaryotic cells can also be used as hosts,
including plant cells, insect cells and avian cells. The use of
Agrobacterium rhizogenes as a vector for expressing genes
in plant cells has been reviewed by Sinkar et al., J. Biosci.
(Bangalore) 11:47-58, 1987. Transformation of insect cells
and production of foreign polypeptides therein is disclosed
by Guarino et al., U.S. Pat. No. 5,162,222 and WIPO
publication WO 94/06463. Insect cells can be infected with
recombinant baculovirus, commonly derived from
Autographa californica nuclear polyhedrosis virus
(AcNPV). DNA encoding the zpep10 polypeptide is inserted
into the baculoviral genome in place of the AcNPV
polyhedrin gene coding sequence by one of two methods. The
first is the traditional method of homologous DNA
recombination between wild-type AcNPV and a transfer vector
containing the zpep10 flanked by AcNPV sequences.
Suitable insect cells, e.g. Sf9 cells, are infected with wild-type
AcNPV and transfected with a transfer vector comprising a
zpep10 polynucleotide operably linked to an AcNPV
polyhedrin gene promoter, terminator, and flanking sequences.
See, King and Possee, The Baculovirus Expression System:
A Laboratory Guide, London, Chapman & Hall; O'Reilly et
al., Baculovirus Expression Vectors: A Laboratory Manual,
New York, Oxford University Press, 1994; and, Richardson,
Ed., Baculovirus Expression Protocols, Methods in
Molecular Biology, Totowa, N.J., Humana Press, 1995. Natural
recombination within an insect cell will result in a
recombinant baculovirus which contains zpep10 driven by the
polyhedrin promoter. Recombinant viral stocks are made by
methods commonly used in the art.

The second method of making recombinant baculovirus
utilizes a transposon-based system described by Luckow et
al. (J. Virol. 67:4566-79, 1993). This system is sold in the
Bac-to-Bac kit (Life Technologies, Rockville, Md.). This
system utilizes a transfer vector, pFastBac1™ (Life
Technologies) containing a Tn7 transposon to move the
DNA encoding the zpep10 polypeptide into a baculovirus
genome maintained in E. coli as a large plasmid called a
"bacmid." The pFastBac1™ transfer vector utilizes the
AcNPV polyhedrin promoter to drive the expression of the
gene of interest, in this case zpep10. However, pFastBac1™
can be modified to a considerable degree. The polyhedrin
promoter can be removed and substituted with the
baculovirus basic protein promoter (also known as Pcor, p6.9 or MP
promoter) which is expressed earlier in the baculovirus
infection, and has been shown to be advantageous for
expressing secreted proteins. See, Hill-Perkins and Possee,
J. Gen. Virol. 71:971-6, 1990; Bonning et al., J. Gen. Virol.
75:1551-6, 1994; and, Chazenbalk and Rapoport, J. Biol.
Chem. 270:1543-9, 1995. In such transfer vector constructs,
a short or long version of the basic protein promoter can be
used. Moreover, transfer vectors can be constructed which
replace the native zpep10 secretory signal sequences with
secretory signal sequences derived from insect proteins. For
example, a secretory signal sequence from Ecdysteroid
Glucosyltransferase (EGT), honey bee Melittin (Invitrogen,
Carlsbad, Calif.), or baculovirus gp67 (PharMingen, San
Diego, Calif.) can be used in constructs to replace the native
secretory signal sequence. In addition, transfer vectors can
include an in-frame fusion with DNA encoding an epitope
tag at the C- or N-terminus of the expressed zpep10
polypeptide, for example, a Glu-Glu epitope tag
(Grussenmeyer et al., ibid.) or FLAG tag (Kodak). Using a
technique known in the art, a transfer vector containing
zpep10 is transformed into E. coli, and screened for bacmids
which contain an interrupted lacZ gene indicative of
recombinant baculovirus. The bacmid DNA containing the
recombinant baculovirus genome is isolated, using common
techniques, and used to transfect Spodoptera frugiperda
cells, e.g. Sf9 cells. Recombinant virus that expresses
zpep10 is subsequently produced. Recombinant viral stocks
are made by methods commonly used in the art.

The recombinant virus is used to infect host cells,
typically a cell line derived from the fall armyworm, Spodoptera
frugiperda. See, in general, Glick and Pasternak, Molecular
Biotechnology: Principles and Applications of Recombinant
DNA, ASM Press, Washington, D.C., 1994. Another suitable
cell line is the High FiveO™ cell line (Invitrogen) derived
from Trichoplusia ni (U.S. Pat. No. 5,300,435).
Commercially available serum-free media are used to grow and
maintain the cells. Suitable media are Sf900 II™ (Life
Technologies) or ESF 921™ (Expression Systems) for the
Sf9 cells; and Ex-cellO405™ (JRH Biosciences, Lenexa,
Kans.) or Express FiveO™ (Life Technologies) for the T. ni
cells. The cells are grown up from an inoculation density of
approximately 2-5×10^5 cells to a density of 1-2×10^6 cells at
which time a recombinant viral stock is added at a
multiplicity of infection (MOI) of 0.1 to 10, more typically near
3. The recombinant virus-infected cells typically produce
the recombinant zpep10 polypeptide at 12-72 hours
post-infection and secrete it with varying efficiency into the
medium. The culture is usually harvested 48 hours
post-infection. Centrifugation is used to separate the cells from
the medium (supernatant). The supernatant containing the
zpep10 polypeptide is filtered through micropore filters,
usually 0.45 µm pore size. Procedures used are generally
described in available laboratory manuals (King and Possee,
ibid.; O'Reilly et al., ibid.; Richardson, C. D., ibid.).
Subsequent purification of the zpep10 polypeptide from the
supernatant can be achieved using methods described
herein.
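
The multiplicity of infection mentioned above is the ratio of
infectious virus particles to cells, so the volume of viral stock to
add follows from simple arithmetic. The sketch below is a worked
example under stated assumptions: cell densities are taken as cells
per milliliter, and the stock titer (in plaque-forming units per
milliliter) and culture volume are hypothetical values that do not
appear in the specification.

```python
def virus_stock_volume_ml(cell_density_per_ml, culture_volume_ml,
                          moi, titer_pfu_per_ml):
    """Volume of viral stock (mL) needed to reach the requested MOI."""
    if not 0.1 <= moi <= 10:
        raise ValueError("MOI outside the 0.1 to 10 range given in the text")
    total_cells = cell_density_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml


if __name__ == "__main__":
    # Assumed values: 100 mL culture at 1e6 cells/mL, stock at 1e8 pfu/mL.
    volume = virus_stock_volume_ml(1e6, 100, moi=3, titer_pfu_per_ml=1e8)
    print(f"add {volume:.2f} mL of viral stock")
```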

Fungal cells, including yeast cells, can also be used within
the present invention. Yeast species of particular interest in
this regard include Saccharomyces cerevisiae, Pichia
pastoris, and Pichia methanolica. Methods for transforming
S. cerevisiae cells with exogenous DNA and producing
recombinant polypeptides therefrom are disclosed by, for
example, Kawasaki, U.S. Pat. No. 4,599,311; Kawasaki et
al., U.S. Pat. No. 4,931,373; Brake, U.S. Pat. No. 4,870,008;
Welch et al., U.S. Pat. No. 5,037,743; and Murray et al., U.S.
Pat. No. 4,845,075. Transformed cells are selected by
phenotype determined by the selectable marker, commonly drug
resistance or the ability to grow in the absence of a particular
nutrient (e.g., leucine). A preferred vector system for use in
Saccharomyces cerevisiae is the POT1 vector system
disclosed by Kawasaki et al. (U.S. Pat. No. 4,931,373), which
allows transformed cells to be selected by growth in
glucose-containing media. Suitable promoters and
terminators for use in yeast include those from glycolytic enzyme
genes (see, e.g., Kawasaki, U.S. Pat. No. 4,599,311;
Kingsman et al., U.S. Pat. No. 4,615,974; and Bitter, U.S. Pat. No.
4,977,092) and alcohol dehydrogenase genes. See also U.S.
Pat. Nos. 4,990,446; 5,063,154; 5,139,936 and 4,661,454.
Transformation systems for other yeasts, including
Hansenula polymorpha, Schizosaccharomyces pombe,
Kluyveromyces lactis, Kluyveromyces fragilis, Ustilago
maydis, Pichia pastoris, Pichia methanolica, Pichia
guillermondii and Candida maltosa are known in the art. See, for
example, Gleeson et al., J. Gen. Microbiol. 132:3459-65,
1986 and Cregg, U.S. Pat. No. 4,882,279. Aspergillus cells
may be utilized according to the methods of McKnight et al.,
U.S. Pat. No. 4,935,349. Methods for transforming
Acremonium chrysogenum are disclosed by Sumino et al., U.S. Pat.
No. 5,162,228. Methods for transforming Neurospora are
disclosed by Lambowitz, U.S. Pat. No. 4,486,533.

The use of Pichia methanolica as host for the production
of recombinant proteins is disclosed in WIPO Publications
WO 97/17450, WO 97/17451, WO 98/02536, and WO
98/02565. DNA molecules for use in transforming P.
methanolica will commonly be prepared as double-stranded,
circular plasmids, which are preferably linearized prior to
transformation. For polypeptide production in P.
methanolica, it is preferred that the promoter and terminator
in the plasmid be that of a P. methanolica gene, such as a P.
methanolica alcohol utilization gene (AUG1 or AUG2).
Other useful promoters include those of the
dihydroxyacetone synthase (DHAS), formate dehydrogenase (FMD),
and catalase (CAT) genes. To facilitate integration of the
DNA into the host chromosome, it is preferred to have the
entire expression segment of the plasmid flanked at both
ends by host DNA sequences. A preferred selectable marker
for use in Pichia methanolica is a P. methanolica ADE2
gene, which encodes phosphoribosyl-5-aminoimidazole
carboxylase (AIRC; EC 4.1.1.21), which allows ade2 host cells
to grow in the absence of adenine. For large-scale, industrial
processes where it is desirable to minimize the use of
methanol, it is preferred to use host cells in which both
methanol utilization genes (AUG1 and AUG2) are deleted.
For production of secreted proteins, host cells deficient in
vacuolar protease genes (PEP4 and PRB1) are preferred.
Electroporation is used to facilitate the introduction of a
plasmid containing DNA encoding a polypeptide of interest
into P. methanolica cells. It is preferred to transform P.
methanolica cells by electroporation using an exponentially
decaying, pulsed electric field having a field strength of from
2.5 to 4.5 kV/cm, preferably about 3.75 kV/cm, and a time
constant (τ) of from 1 to 40 milliseconds, most preferably
about 20 milliseconds.
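
The electroporation parameters above translate into instrument
settings by simple arithmetic: the applied voltage is the field
strength multiplied by the electrode gap, and an exponentially
decaying pulse falls to 1/e of its starting value after one time
constant. The sketch below assumes a 0.2 cm cuvette gap, which is a
hypothetical value not stated in the specification; the function
names are likewise illustrative.

```python
import math


def pulse_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage (kV) needed across a cuvette gap to reach the field strength."""
    return field_kv_per_cm * gap_cm


def field_at(t_ms, initial_field_kv_per_cm, tau_ms):
    """Field strength at time t for an exponentially decaying pulse."""
    return initial_field_kv_per_cm * math.exp(-t_ms / tau_ms)


def within_stated_ranges(field_kv_per_cm, tau_ms):
    """Check the 2.5 to 4.5 kV/cm and 1 to 40 ms ranges given in the text."""
    return 2.5 <= field_kv_per_cm <= 4.5 and 1 <= tau_ms <= 40


if __name__ == "__main__":
    gap_cm = 0.2  # assumed cuvette gap, not from the specification
    print(f"{pulse_voltage_kv(3.75, gap_cm):.2f} kV")
    print(f"{field_at(20, 3.75, 20):.2f} kV/cm after one time constant")
```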

Prokaryotic host cells, including strains of the bacteria
Escherichia, Bacillus and other genera are also useful host
cells within the present invention. Techniques for
transforming these hosts and expressing foreign DNA sequences
cloned therein are well known in the art (see, e.g., Sambrook
et al., ibid.). When expressing a zpep10 polypeptide in
bacteria such as E. coli, the polypeptide may be retained in
the cytoplasm, typically as insoluble granules, or may be
directed to the periplasmic space by a bacterial secretion
sequence. In the former case, the cells are lysed, and the
granules are recovered and denatured using, for example,
guanidine isothiocyanate or urea. The denatured polypeptide
can then be refolded and dimerized by diluting the
denaturant, such as by dialysis against a solution of urea and
a combination of reduced and oxidized glutathione,
followed by dialysis against a buffered saline solution. In the
latter case, the polypeptide can be recovered from the
periplasmic space in a soluble and functional form by
disrupting the cells (by, for example, sonication or osmotic
shock) to release the contents of the periplasmic space and
recovering the protein, thereby obviating the need for
denaturation and refolding.

Transformed or transfected host cells are cultured
according to conventional procedures in a culture medium
containing nutrients and other components required for the
growth of the chosen host cells. A variety of suitable media,
including defined media and complex media, are known in
the art and generally include a carbon source, a nitrogen
source, essential amino acids, vitamins and minerals. Media
may also contain such components as growth factors or
serum, as required. The growth medium will generally select
for cells containing the exogenously added DNA by, for
example, drug selection or deficiency in an essential nutrient
which is complemented by the selectable marker carried on
the expression vector or co-transfected into the host cell. P.
methanolica cells are cultured in a medium comprising
adequate sources of carbon, nitrogen and trace nutrients at a
temperature of about 25° C. to 35° C. Liquid cultures are
provided with sufficient aeration by conventional means,
such as shaking of small flasks or sparging of fermentors. A
preferred culture medium for P. methanolica is YEPD (2%
D-glucose, 2% Bacto™ Peptone (Difco Laboratories,
Detroit, Mich.), 1% Bacto™ yeast extract (Difco
Laboratories), 0.004% adenine and 0.006% L-leucine).
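
The YEPD percentages above can be converted into weighed amounts
for a given batch. The sketch below assumes the percentages are
weight per volume (1% equals 1 g per 100 mL, or 10 g per liter),
which is the usual convention for media recipes but is not stated
explicitly in the specification; the names are hypothetical.

```python
# Percentages as listed for YEPD, assumed to be weight/volume.
YEPD_PERCENT_WV = {
    "D-glucose": 2.0,
    "Bacto Peptone": 2.0,
    "Bacto yeast extract": 1.0,
    "adenine": 0.004,
    "L-leucine": 0.006,
}


def grams_needed(volume_liters, recipe=YEPD_PERCENT_WV):
    """Grams of each component for the batch (1% w/v = 10 g per liter)."""
    return {name: pct * 10.0 * volume_liters for name, pct in recipe.items()}


if __name__ == "__main__":
    for name, grams in grams_needed(1.0).items():
        print(f"{name}: {grams:.2f} g")
```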

Zpep10 polypeptides or fragments thereof may also be
prepared through chemical synthesis. Zpep10 polypeptides
may be monomers or multimers; glycosylated or
non-glycosylated; pegylated or non-pegylated; and may or may
not include an initial methionine amino acid residue.

Expressed recombinant zpep10 polypeptides (or chimeric
zpep10 polypeptides) can be purified using fractionation
and/or conventional purification methods and media.
Ammonium sulfate precipitation and acid or chaotrope
extraction may be used for fractionation of samples.
Exemplary purification steps may include hydroxyapatite, size
exclusion, FPLC and reverse-phase high performance liquid
chromatography. Suitable anion exchange media include
derivatized dextrans, agarose, cellulose, polyacrylamide,
specialty silicas, and the like. DEAE Fast-Flow Sepharose
(Pharmacia, Piscataway, N.J.), PEI, DEAE, QAE and Q
derivatives are preferred. Exemplary chromatographic
media include those media derivatized with phenyl, butyl, or
octyl groups, such as Phenyl-Sepharose FF (Pharmacia),
Toyopearl butyl 650 (Toso Haas, Montgomeryville, Pa.),
Octyl-Sepharose (Pharmacia) and the like; or polyacrylic
resins, such as Amberchrom CG 71 (Toso Haas) and the like.
Suitable solid supports include glass beads, silica-based
resins, cellulosic resins, agarose beads, cross-linked agarose
beads, polystyrene beads, cross-linked polyacrylamide
resins and the like that are insoluble under the conditions in
which they are to be used. These supports may be modified
with reactive groups that allow attachment of proteins by
amino groups, carboxyl groups, sulfhydryl groups, hydroxyl
groups and/or carbohydrate moieties. Examples of coupling
chemistries include cyanogen bromide activation,
N-hydroxysuccinimide activation, epoxide activation,
sulfhydryl activation, hydrazide activation, and carboxyl and
amino derivatives for carbodiimide coupling chemistries.
These and other solid media are well known and widely used
in the art, and are available from commercial suppliers.
Methods for binding receptor polypeptides to support media
are well known in the art. Selection of a particular method
is a matter of routine design and is determined in part by the
properties of the chosen support. See, for example, Affinity
Chromatography: Principles & Methods, Pharmacia LKB
Biotechnology, Uppsala, Sweden, 1988.

The zpep10 polypeptides of the present invention can be
isolated by exploitation of their structural features. Within
one embodiment of the invention is included a fusion of the
polypeptide of interest and an affinity tag (e.g.,
polyhistidine, Glu-Glu, FLAG, maltose-binding protein,
an immunoglobulin domain) that may be constructed to
facilitate purification.

Protein refolding (and optionally reoxidation) procedures 
may be advantageously used. It is preferred to purify the 
protein to >80% purity, more preferably to >90% purity, 
even more preferably >95%, and particularly preferred is a 
pharmaceutically pure state, that is greater than 99.9% pure
with respect to contaminating macromolecules, particularly 
other proteins and nucleic acids, and free of infectious and 
pyrogenic agents. Preferably, a purified protein is substan- 
tially free of other proteins, particularly other proteins of 
animal origin. 

Proteins/polypeptides which bind zpep10 (such as a
zpep10-binding receptor or other membrane glycoprotein)
can also be used for purification of zpep10. The
zpep10-binding protein/polypeptide is immobilized on a solid
support, such as beads of agarose, cross-linked agarose,
glass, cellulosic resins, silica-based resins, polystyrene,
cross-linked polyacrylamide, or like materials that are stable
under the conditions of use. Methods for linking
polypeptides to solid supports are known in the art, and include
amine chemistry, cyanogen bromide activation,
N-hydroxysuccinimide activation, epoxide activation,
sulfhydryl activation, and hydrazide activation. The resulting
medium will generally be configured in the form of a
column, and fluids containing zpep10 polypeptide are
passed through the column one or more times to allow
zpep10 polypeptide to bind to the ligand-binding or receptor
polypeptide. The bound zpep10 polypeptide is then eluted
using changes in salt concentration, chaotropic agents
(guanidine HCl), or pH to disrupt ligand-receptor binding.

In vitro and in vivo response to soluble zpep10 can also
be measured using cultured cells or by administering
molecules of the claimed invention to the appropriate animal
model. For instance, soluble zpep10 transfected expression
host cells may be embedded in an alginate environment and
injected (implanted) into recipient animals.
Alginate-poly-L-lysine micro-encapsulation, permselective membrane
encapsulation and diffusion chambers have been described
as a means to entrap transfected mammalian cells or primary
mammalian cells. These types of non-immunogenic
"encapsulations" or microenvironments permit the transfer of
nutrients into the microenvironment, and also permit the
diffusion of proteins and other macromolecules secreted or
released by the captured cells across the environmental
barrier to the recipient animal. Most importantly, the
capsules or microenvironments mask and shield the foreign,
embedded cells from the recipient animal's immune
response. Such microenvironments can extend the life of the
injected cells from a few hours or days (naked cells) to
several weeks (embedded cells).

Alginate threads provide a simple and quick means for
generating embedded cells. The materials needed to generate
the alginate threads are readily available and relatively
inexpensive. Once made, the alginate threads are relatively
strong and durable, both in vitro and, based on data obtained
using the threads, in vivo. The alginate threads are easily
manipulable and the methodology is scalable for preparation
of numerous threads. In an exemplary procedure, 3% alginate
is prepared in sterile H2O, and sterile filtered. Just prior
to preparation of alginate threads, the alginate solution is
again filtered. An approximately 50% cell suspension
(containing about 5x10^ to about 5x10^ cells/ml) is mixed
with the 3% alginate solution. One ml of the alginate/cell
suspension is extruded into a 100 mM sterile filtered CaCl2
solution over a time period of ~15 min, forming a "thread".
The extruded thread is then transferred into a solution of 50
mM CaCl2, and then into a solution of 25 mM CaCl2. The
thread is then rinsed with deionized water before coating the
thread by incubating in a 0.01% solution of poly-L-lysine.
Finally, the thread is rinsed with Lactated Ringer's Solution
and drawn from solution into a syringe barrel (without
needle attached). A large bore needle is then attached to the
syringe, and the thread is intraperitoneally injected into a
recipient in a minimal volume of the Lactated Ringer's
Solution.
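
The successive calcium chloride baths in the exemplary procedure
(100 mM, then 50 mM, then 25 mM) can each be prepared from one stock
using the usual dilution relation C1 x V1 = C2 x V2. The following
Python sketch is illustrative only; the 1 M stock concentration and
the 10 ml bath volume are assumptions chosen for the example and are
not taken from the procedure described above.

```python
# Hypothetical values, not from the specification:
STOCK_MM = 1000.0        # assumed 1 M CaCl2 stock, in mM
BATH_VOLUME_ML = 10.0    # assumed final volume of each bath, in ml


def stock_volume_ml(target_mM, final_volume_ml, stock_mM):
    """Stock volume satisfying C_stock * V_stock = C_target * V_final."""
    if target_mM > stock_mM:
        raise ValueError("target concentration exceeds stock concentration")
    return target_mM * final_volume_ml / stock_mM


def bath_recipes(targets_mM=(100.0, 50.0, 25.0)):
    """Return (target mM, ml of stock, ml of water) for each bath, in order."""
    recipes = []
    for target in targets_mM:
        v_stock = stock_volume_ml(target, BATH_VOLUME_ML, STOCK_MM)
        recipes.append((target, v_stock, BATH_VOLUME_ML - v_stock))
    return recipes
```

Calling bath_recipes() returns one tuple per bath in the order in
which the thread is transferred through them.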

An alternative in vivo approach for assaying soluble
proteins of the present invention involves viral delivery
systems. Exemplary viruses for this purpose include
adenovirus, herpesvirus, vaccinia virus and
adeno-associated virus (AAV). Adenovirus, a double-stranded
DNA virus, is currently the best studied gene transfer vector
for delivery of heterologous nucleic acid (for a review, see
Becker et al., Meth. Cell Biol. 43:161-89, 1994; and
Douglas and Curiel, Science & Medicine 4:44-53, 1997). The
adenovirus system offers several advantages: adenovirus can
(i) accommodate relatively large DNA inserts; (ii) be grown
to high-titer; (iii) infect a broad range of mammalian cell
types; and (iv) be used with a large number of available
vectors containing different promoters. Also, because
adenoviruses are stable in the bloodstream, they can be
administered by intravenous injection. Some disadvantages
(especially for gene therapy) associated with adenovirus
gene delivery include: (i) very low efficiency integration into
the host genome; (ii) existence in primarily episomal form;
and (iii) the host immune response to the administered virus,
precluding readministration of the adenoviral vector.

By deleting portions of the adenovirus genome, larger
inserts (up to 7 kb) of heterologous DNA can be
accommodated. These inserts can be incorporated into the viral DNA
by direct ligation or by homologous recombination with a
co-transfected plasmid. In an exemplary system, the
essential E1 gene has been deleted from the viral vector, and the
virus will not replicate unless the E1 gene is provided by the
host cell (the human 293 cell line is exemplary). When
intravenously administered to intact animals, adenovirus
primarily targets the liver. If the adenoviral delivery system
has an E1 gene deletion, the virus cannot replicate in the host
cells. However, the host's tissue (e.g., liver) will express and
process (and, if a secretory signal sequence is present,
secrete) the heterologous protein. Secreted proteins will
enter the circulation in the highly vascularized liver, and
effects on the infected animal can be determined.

The adenovirus system can also be used for protein
production in vitro. By culturing adenovirus-infected
non-293 cells under conditions where the cells are not rapidly
dividing, the cells can produce proteins for extended periods
of time. For instance, BHK cells are grown to confluence in
cell factories, then exposed to the adenoviral vector
encoding the secreted protein of interest. The cells are then grown
under serum-free conditions, which allows infected cells to
survive for several weeks without significant cell division.
Alternatively, adenovirus vector infected 293S cells can be
grown in suspension culture at relatively high cell density to
produce significant amounts of protein (see Garnier et al.,
Cytotechnol. 15:145-55, 1994). With either protocol, an
expressed, secreted heterologous protein can be repeatedly
isolated from the cell culture supernatant. Within the
infected 293S cell production protocol, non-secreted
proteins may also be effectively obtained.

An assay system that uses a ligand-binding receptor (or an
antibody, one member of a complement/anti-complement
pair) or a binding fragment thereof, and a commercially
available biosensor instrument (BIAcore™, Pharmacia
Biosensor, Piscataway, N.J.) may be advantageously
employed. Such receptor, antibody, member of a
complement/anti-complement pair or fragment is
immobilized onto the surface of a receptor chip. Use of this
instrument is disclosed by Karlsson, J. Immunol. Methods
145:229-40, 1991 and Cunningham and Wells, J. Mol. Biol.
234:554-63, 1993. A receptor, antibody, member or
fragment is covalently attached, using amine or sulfhydryl
chemistry, to dextran fibers that are attached to gold film
within the flow cell. A test sample is passed through the cell.
If a ligand, epitope, or opposite member of the
complement/anti-complement pair is present in the sample, it will bind to
the immobilized receptor, antibody or member, respectively,
causing a change in the refractive index of the medium,
which is detected as a change in surface plasmon resonance
of the gold film. This system allows the determination of on-
and off-rates, from which binding affinity can be calculated,
and assessment of stoichiometry of binding. As used herein,
the term complement/anti-complement pair denotes
non-identical moieties that form a non-covalently associated,
stable pair under appropriate conditions. For instance, biotin
and avidin (or streptavidin) are prototypical members of a
complement/anti-complement pair. Other exemplary
complement/anti-complement pairs include receptor/ligand
pairs, antibody/antigen (or hapten or epitope) pairs,
sense/antisense polynucleotide pairs, and the like. Where
subsequent dissociation of the complement/anti-complement pair
is desirable, the complement/anti-complement pair
preferably has a binding affinity of <\if
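
The relation between measured on- and off-rates and binding affinity
can be written out explicitly. For a simple 1:1 interaction, the
equilibrium association constant is Ka = k_on / k_off and the
dissociation constant is Kd = k_off / k_on. The Python sketch below
assumes that 1:1 model, which is an assumption made for illustration;
the rate values in the usage line are hypothetical and are not
measurements from this disclosure.

```python
def affinity_from_rates(k_on, k_off):
    """Return (Ka in M^-1, Kd in M) for a 1:1 binding model.

    k_on  -- association rate constant, in M^-1 s^-1
    k_off -- dissociation rate constant, in s^-1
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_on / k_off, k_off / k_on


# Hypothetical rate constants, for illustration only.
ka, kd = affinity_from_rates(k_on=1.0e5, k_off=1.0e-3)
```

A slower off-rate at a fixed on-rate gives a larger Ka, that is,
tighter binding.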

Zpep10 polypeptide and other ligand homologs can also
be used within other assay systems known in the art. Such
systems include Scatchard analysis for determination of
binding affinity (see Scatchard, Ann. NY Acad. Sci. 51:
660-72, 1949) and calorimetric assays (Cunningham et al.,
Science 253:545-8, 1991; Cunningham et al., Science
245:821-5, 1991).
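
As an illustration of Scatchard analysis, the ratio of bound to free
ligand (B/F) is plotted against bound ligand (B). For a single class
of non-interacting sites, B/F = Ka x (Bmax - B), so a straight-line
fit has slope -Ka and x-intercept Bmax. The Python sketch below is a
minimal least-squares version under that single-site assumption; the
function names and the helper used to generate consistent example
data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares fit of B/F against B.

    bound, free -- equal-length sequences of molar concentrations.
    Returns (Ka in M^-1, Bmax in M) under a single-site model.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope                 # slope of the Scatchard line is -Ka
    bmax = intercept / ka       # y-intercept is Ka * Bmax
    return ka, bmax


def simulate_free(bound, ka, bmax):
    """Free ligand values consistent with B/F = Ka * (Bmax - B)."""
    return [b / (ka * (bmax - b)) for b in bound]
```

Real binding data are noisy and may show curvature when more than
one class of site is present, in which case a single straight-line
fit is not appropriate.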

The invention also provides anti-zpep10 antibodies.
Antibodies to zpep10 can be obtained, for example, using as an
antigen the product of a zpep10 expression vector, or zpep10
isolated from a natural source. Particularly useful
anti-zpep10 antibodies "bind specifically" with zpep10.
Antibodies are considered to be specifically binding if the antibodies
bind to a zpep10 polypeptide, peptide or epitope with a
binding affinity (Ka) of 10^ M"^ or greater, preferably 10^
M"^ or greater, more preferably 10* M"^ or greater, and most 
preferably 10^ M""^ or greater. The binding affinity of an 
antibody can be readily determined by one of ordinary skill 
in the art, for example, by Scatchard analysis (Scatchard, 
Ann. NY Acad. Sci. 51:660, 1949). Suitable antibodies
include antibodies that bind with zpep10, in particular the
extracellular domain of zpep10 (amino acid residues 21-111
of SEQ ID NO:2).

Anti-zpep10 antibodies can be produced using antigenic
zpep10 epitope-bearing peptides and polypeptides.
Antigenic epitope-bearing peptides and polypeptides of the
present invention contain a sequence of at least nine,
preferably 15 to about 30, amino acids contained within
SEQ ID NO:2. However, peptides or polypeptides
comprising a larger portion of an amino acid sequence of the
invention, containing from 30 to 50 amino acids, or any
length up to and including the entire amino acid sequence of
a polypeptide of the invention, also are useful for inducing
antibodies that bind with zpep10. It is desirable that the
amino acid sequence of the epitope-bearing peptide is
selected to provide substantial solubility in aqueous solvents
(i.e., the sequence includes relatively hydrophilic residues,
while hydrophobic residues are preferably avoided). The
hydrophobicity plot provided in the FIGURE provides such
information. Using the plot, antigenic regions can be
selected, such as those found in the fragments of amino acid
residues 39-44, 65-70, 38-43, 62-67 and 96-101 of SEQ ID
NO:2. Moreover, amino acid sequences containing proline
residues may also be desirable for antibody production.
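
The selection of hydrophilic regions from a hydrophobicity plot can
be sketched as a sliding-window average over the sequence. The Python
example below is illustrative only: it assumes the Hopp-Woods
hydrophilicity scale and a six-residue window (matching the length of
the fragments listed above), neither of which is stated to be the
method used to generate the FIGURE, and the function names are
hypothetical.

```python
# Hopp-Woods hydrophilicity values (higher = more hydrophilic).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def window_scores(sequence, window=6):
    """Mean hydrophilicity of every window in the sequence.

    Returns (start, end, score) tuples with 1-based, inclusive
    residue numbers.
    """
    seq = sequence.upper()
    scores = []
    for i in range(len(seq) - window + 1):
        segment = seq[i:i + window]
        mean = sum(HOPP_WOODS[aa] for aa in segment) / window
        scores.append((i + 1, i + window, mean))
    return scores


def top_windows(sequence, window=6, count=3):
    """The `count` most hydrophilic windows, best first."""
    ranked = sorted(window_scores(sequence, window),
                    key=lambda item: item[2], reverse=True)
    return ranked[:count]
```

Overlapping windows will often rank next to each other, so in
practice the top candidates would be merged or inspected by eye
before peptides are chosen.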

Polyclonal antibodies to recombinant zpep10 protein or to
zpep10 isolated from natural sources can be prepared using
methods well-known to those of skill in the art. See, for
example, Green et al., "Production of Polyclonal Antisera,"
in Immunochemical Protocols (Manson, ed.), pages 1-5
(Humana Press 1992), and Williams et al., "Expression of
foreign proteins in E. coli using plasmid vectors and
purification of specific polyclonal antibodies," in DNA Cloning
2: Expression Systems, 2nd Edition, Glover et al. (eds.),
page 15 (Oxford University Press 1995). The
immunogenicity of a zpep10 polypeptide can be increased through the
use of an adjuvant, such as alum (aluminum hydroxide) or
Freund's complete or incomplete adjuvant. Polypeptides
useful for immunization also include fusion polypeptides,
such as fusions of zpep10 or a portion thereof with an
immunoglobulin polypeptide or with maltose binding
protein. The polypeptide immunogen may be a full-length
molecule or a portion thereof. If the polypeptide portion is
"hapten-like," such portion may be advantageously joined or
linked to a macromolecular carrier (such as keyhole limpet
hemocyanin (KLH), bovine serum albumin (BSA) or tetanus
toxoid) for immunization.

Although polyclonal antibodies are typically raised in
animals such as horses, cows, dogs, chickens, rats, mice,
rabbits, hamsters, guinea pigs, goats or sheep, an
anti-zpep10 antibody of the present invention may also be
derived from a subhuman primate antibody. General
techniques for raising diagnostically and therapeutically useful
antibodies in baboons may be found, for example, in
Goldenberg et al., international patent publication No. WO
91/11465, and in Losman et al., Int. J. Cancer 46:310, 1990.
Antibodies can also be raised in transgenic animals such as
transgenic sheep, cows, goats or pigs, and may be expressed
in yeast and fungi in modified forms as well as in mammalian
and insect cells.

Alternatively, monoclonal anti-zpep10 antibodies can be
generated. Rodent monoclonal antibodies to specific
antigens may be obtained by methods known to those skilled in
the art (see, for example, Kohler et al., Nature 256:495
(1975), Coligan et al. (eds.), Current Protocols in
Immunology, Vol. 1, pages 2.5.1-2.6.7 (John Wiley & Sons
1991), Picksley et al., "Production of monoclonal antibodies
against proteins expressed in E. coli," in DNA Cloning 2:
Expression Systems, 2nd Edition, Glover et al. (eds.), page
93 (Oxford University Press 1995)).

Briefly, monoclonal antibodies can be obtained by
injecting mice with a composition comprising a zpep10 gene
product, verifying the presence of antibody production by
removing a serum sample, removing the spleen to obtain
B-lymphocytes, fusing the B-lymphocytes with myeloma
cells to produce hybridomas, cloning the hybridomas,
selecting positive clones which produce antibodies to the antigen,
culturing the clones that produce antibodies to the antigen,
and isolating the antibodies from the hybridoma cultures.

In addition, an anti-zpep10 antibody of the present
invention may be derived from a human monoclonal antibody.
Human monoclonal antibodies are obtained from transgenic
mice that have been engineered to produce specific human
antibodies in response to antigenic challenge. In this
technique, elements of the human heavy and light chain
locus are introduced into strains of mice derived from
embryonic stem cell lines that contain targeted disruptions
of the endogenous heavy chain and light chain loci. The
transgenic mice can synthesize human antibodies specific
for human antigens, and the mice can be used to produce
human antibody-secreting hybridomas. Methods for
obtaining human antibodies from transgenic mice are described,
for example, by Green et al., Nat. Genet. 7:13, 1994,
Lonberg et al., Nature 368:856, 1994, and Taylor et al., Int.
Immun. 6:579, 1994.

Monoclonal antibodies can be isolated and purified from
hybridoma cultures by a variety of well-established
techniques. Such isolation techniques include affinity
chromatography with Protein-A Sepharose, size-exclusion
chromatography, and ion-exchange chromatography (see,
for example, Coligan at pages 2.7.1-2.7.12 and pages
2.9.1-2.9.3; Baines et al., "Purification of Immunoglobulin
G (IgG)," in Methods in Molecular Biology, Vol. 10, pages
79-104 (The Humana Press, Inc. 1992)).

For particular uses, it may be desirable to prepare
fragments of anti-zpep10 antibodies. Such antibody fragments
can be obtained, for example, by proteolytic hydrolysis of
the antibody. Antibody fragments can be obtained by pepsin
or papain digestion of whole antibodies by conventional
methods. As an illustration, antibody fragments can be
produced by enzymatic cleavage of antibodies with pepsin
to provide a 5S fragment denoted F(ab')2. This fragment can
be further cleaved using a thiol reducing agent to produce
3.5S Fab' monovalent fragments. Optionally, the cleavage
reaction can be performed using a blocking group for the
sulfhydryl groups that result from cleavage of disulfide
linkages. As an alternative, an enzymatic cleavage using
papain produces two monovalent Fab fragments and an Fc
fragment directly. These methods are described, for
example, by Goldenberg, U.S. Pat. No. 4,331,647, Nisonoff
et al., Arch. Biochem. Biophys. 89:230, 1960, Porter,
Biochem. J. 73:119, 1959, Edelman et al., in Methods in
Enzymology Vol. 1, page 422 (Academic Press 1967), and by
Coligan, ibid.

Other methods of cleaving antibodies, such as separation
of heavy chains to form monovalent light-heavy chain
fragments, further cleavage of fragments, or other
enzymatic, chemical or genetic techniques may also be used,
so long as the fragments bind to the antigen that is
recognized by the intact antibody.

For example, Fv fragments comprise an association of VH
and VL chains. This association can be noncovalent, as
described by Inbar et al., Proc. Nat'l Acad. Sci. USA
69:2659, 1972. Alternatively, the variable chains can be
linked by an intermolecular disulfide bond or cross-linked
by chemicals such as glutaraldehyde (see, for example,
Sandhu, Crit. Rev. Biotech. 12:437, 1992).

The Fv fragments may comprise VH and VL chains which
are connected by a peptide linker. These single-chain antigen
binding proteins (scFv) are prepared by constructing a
structural gene comprising DNA sequences encoding the VH
and VL domains which are connected by an oligonucleotide.
The structural gene is inserted into an expression vector
which is subsequently introduced into a host cell, such as E.
coli. The recombinant host cells synthesize a single
polypeptide chain with a linker peptide bridging the two V domains.
Methods for producing scFvs are described, for example, by
Whitlow et al., Methods: A Companion to Methods in
Enzymology 2:97, 1991; see also Bird et al., Science
242:423, 1988, Ladner et al., U.S. Pat. No. 4,946,778, Pack
et al., Bio/Technology 11:1271, 1993, and Sandhu, supra.

As an illustration, a scFv can be obtained by exposing
lymphocytes to zpep10 polypeptide in vitro, and selecting
antibody display libraries in phage or similar vectors (for
instance, through use of immobilized or labeled zpep10
protein or peptide). Genes encoding polypeptides having
potential zpep10 polypeptide binding domains can be
obtained by screening random peptide libraries displayed on
phage (phage display) or on bacteria, such as E. coli.
Nucleotide sequences encoding the polypeptides can be
obtained in a number of ways, such as through random
mutagenesis and random polynucleotide synthesis. These
random peptide display libraries can be used to screen for
peptides which interact with a known target which can be a
protein or polypeptide, such as a ligand or receptor, a
biological or synthetic macromolecule, or organic or
inorganic substances. Techniques for creating and screening
such random peptide display libraries are known in the art
(Ladner et al., U.S. Pat. No. 5,223,409, Ladner et al., U.S.
Pat. No. 4,946,778, Ladner et al., U.S. Pat. No. 5,403,484,
Ladner et al., U.S. Pat. No. 5,571,698, and Kay et al., Phage
Display of Peptides and Proteins (Academic Press, Inc.
1996)) and random peptide display libraries and kits for
screening such libraries are available commercially, for
instance from Clontech (Palo Alto, Calif.), Invitrogen Inc.
(San Diego, Calif.), New England Biolabs, Inc. (Beverly,
Mass.), and Pharmacia LKB Biotechnology Inc.
(Piscataway, N.J.). Random peptide display libraries can be
screened using the zpep10 sequences disclosed herein to
identify proteins which bind to zpep10.

Another form of an antibody fragment is a peptide coding
for a single complementarity-determining region (CDR).
CDR peptides ("minimal recognition units") can be obtained
by constructing genes encoding the CDR of an antibody of
interest. Such genes are prepared, for example, by using the
polymerase chain reaction to synthesize the variable region
from RNA of antibody-producing cells (see, for example,
Larrick et al., Methods: A Companion to Methods in
Enzymology 2:106, 1991, Courtenay-Luck, "Genetic
Manipulation of Monoclonal Antibodies," in Monoclonal Antibodies:
Production, Engineering and Clinical Application, Ritter et
al. (eds.), page 166 (Cambridge University Press 1995), and
Ward et al., "Genetic Manipulation and Expression of
Antibodies," in Monoclonal Antibodies: Principles and
Applications, Birch et al., (eds.), page 137 (Wiley-Liss, Inc.
1995)).

Alternatively, an anti-zpep10 antibody may be derived
from a "humanized" monoclonal antibody. Humanized
monoclonal antibodies are produced by transferring mouse
complementarity determining regions from heavy and light
variable chains of the mouse immunoglobulin into a human
variable domain. Typical residues of human antibodies are
then substituted in the framework regions of the murine
counterparts. The use of antibody components derived from
humanized monoclonal antibodies obviates potential
problems associated with the immunogenicity of murine constant
regions. General techniques for cloning murine
immunoglobulin variable domains are described, for example, by
Orlandi et al., Proc. Nat'l Acad. Sci. USA 86:3833, 1989.
Techniques for producing humanized monoclonal antibodies
are described, for example, by Jones et al., Nature 321:522,
1986, Carter et al., Proc. Nat'l Acad. Sci. USA 89:4285,
1992, Sandhu, Crit. Rev. Biotech. 12:437, 1992, Singer et al.,
J. Immun. 150:2844, 1993, Sudhir (ed.), Antibody
Engineering Protocols (Humana Press, Inc. 1995), Kelley,
"Engineering Therapeutic Antibodies," in Protein Engineering:
Principles and Practice, Cleland et al. (eds.), pages 399-434
(John Wiley & Sons, Inc. 1996), and by Queen et al., U.S.
Pat. No. 5,693,762 (1997).

Polyclonal anti-idiotype antibodies can be prepared by
immunizing animals with anti-zpep10 antibodies or
antibody fragments, using standard techniques. See, for
example, Green et al., "Production of Polyclonal Antisera,"
in Methods In Molecular Biology: Immunochemical
Protocols, Manson (ed.), pages 1-12 (Humana Press 1992).
Also, see Coligan, ibid. at pages 2.4.1-2.4.7. Alternatively,
monoclonal anti-idiotype antibodies can be prepared using
anti-zpep10 antibodies or antibody fragments as
immunogens with the techniques described above. As another
alternative, humanized anti-idiotype antibodies or
subhuman primate anti-idiotype antibodies can be prepared using
the above-described techniques. Methods for producing
anti-idiotype antibodies are described, for example, by Irie,
U.S. Pat. No. 5,208,146, Greene et al., U.S. Pat. No.
5,637,677, and Varthakavi and Minocha, J. Gen. Virol.
77:1875, 1996.

Antibodies or polypeptides herein can also be directly or 
indirectly conjugated to drugs, toxins, radionuclides and the 
like, and these conjugates used for in vivo diagnostic or 
therapeutic applications. For instance, polypeptides or anti- 
bodies of the present invention can be used to identify or 
treat tissues or organs that express a corresponding anti- 
complementary molecule (receptor or antigen, respectively, 
for instance). More specifically, zpep10 polypeptides or
anti-zpep10 antibodies, or bioactive fragments or portions
thereof, can be coupled to detectable or cytotoxic molecules 
and delivered to a mammal having cells, tissues or organs 
that express the anti-complementary molecule. 

Suitable detectable molecules may be directly or indi- 
rectly attached to the polypeptide or antibody, and include 
radionuclides, enzymes, substrates, cofactors, inhibitors, 
fluorescent markers, chemiluminescent markers, magnetic 
particles and the like. Suitable cytotoxic molecules may be 
directly or indirectly attached to the polypeptide or antibody, 
and include bacterial or plant toxins (for instance, diphtheria 
toxin, Pseudomonas exotoxin, ricin, abrin and the like), as 
well as therapeutic radionuclides, such as iodine-131, 
rhenium-188 or yttrium-90 (either directly attached to the 
polypeptide or antibody, or indirectly attached through 
means of a chelating moiety, for instance). Polypeptides or 
antibodies may also be conjugated to cytotoxic drugs, such 
as adriamycin. For indirect attachment of a detectable or 
cytotoxic molecule, the detectable or cytotoxic molecule can 
be conjugated with a member of a complementary/ 
anticomplementary pair, where the other member is bound 
to the polypeptide or antibody portion. For these purposes,
biotin/streptavidin is an exemplary complementary/ 
anticomplementary pair. 

Soluble zpep10 polypeptides or antibodies to zpep10 can
be directly or indirectly conjugated to drugs, toxins,
radionuclides and the like, and these conjugates used for in vivo
diagnostic or therapeutic applications. For instance,
polypeptides or antibodies of the present invention can be
used to identify or treat tissues or organs that express a
corresponding anti-complementary molecule (receptor or
antigen, respectively, for instance). More specifically,
zpep10 polypeptides or anti-zpep10 antibodies, or bioactive
fragments or portions thereof, can be coupled to detectable
or cytotoxic molecules and delivered to a mammal having
cells, tissues or organs that express the anti-complementary
molecule.

Suitable detectable molecules can be directly or indirectly 
attached to the polypeptide or antibody, and include 
radionuclides, enzymes, substrates, cofactors, inhibitors, 
fluorescent markers, chemiluminescent markers, magnetic 
particles and the like. Suitable cytotoxic molecules can be 
directly or indirectly attached to the polypeptide or antibody, 
and include bacterial or plant toxins (for instance, diphtheria 
toxin, Pseudomonas exotoxin, ricin, abrin and the like), as 
well as therapeutic radionuclides, such as iodine-131,
rhenium-188 or yttrium-90 (either directly attached to the
polypeptide or antibody, or indirectly attached through
means of a chelating moiety, for instance). Polypeptides or
antibodies can also be conjugated to cytotoxic drugs, such as
adriamycin. For indirect attachment of a detectable or
cytotoxic molecule, the detectable or cytotoxic molecule can be
conjugated with a member of a complementary/
anticomplementary pair, where the other member is bound
to the polypeptide or antibody portion. For these purposes,
biotin/streptavidin is an exemplary complementary/
anticomplementary pair.

Such polypeptide-toxin fusion proteins or antibody/
fragment-toxin fusion proteins can be used for targeted cell
or tissue inhibition or ablation (for instance, to treat cancer
cells or tissues). Alternatively, if the polypeptide has
multiple functional domains (i.e., an activation domain or a
ligand binding domain, plus a targeting domain), a fusion
protein including only the targeting domain can be suitable
for directing a detectable molecule, a cytotoxic molecule or
a complementary molecule to a cell or tissue type of interest.
In instances where the domain-only fusion protein includes
a complementary molecule, the anti-complementary
molecule can be conjugated to a detectable or cytotoxic
molecule. Such domain-complementary molecule fusion
proteins thus represent a generic targeting vehicle for cell/
tissue-specific delivery of generic anti-complementary-
detectable/cytotoxic molecule conjugates. The bioactive
polypeptide or antibody conjugates described herein can be
delivered intravenously, intraarterially or intraductally, or
may be introduced locally at the intended site of action.

The zpep10 gene is almost exclusively expressed in the
testis. Low levels of transcript are also seen in a number of
other tissues, with the kidney accounting for most of the
ancillary expression. The tissue specificity observed for
zpep10 suggests a general role in development and
regulatory control of testicular differentiation and gonadal
steroidogenesis and spermatogenesis. Zpep10 polypeptides,
agonists and antagonists have enormous potential in both in
vitro and in vivo applications.

Development of testicular hormone production can be
divided into early and late steps, with the latter dependent on
the activation of functionally-determined Leydig cell precursors
by LH. However, the factors that control the early
steps in this process remain unknown (Huhtaniemi, Reprod.
Fertil. Dev. 7:1025-35, 1995), suggesting that testis specific
polypeptides such as zpep10 might be responsible for activation
of a non-steroidogenic, non-LH responsive precursor
cell.

Once Leydig cell differentiation has occurred, production
of steroid hormones in the testis is dependent on the secretion
of the gonadotrophins, LH and FSH, by the pituitary.
LH stimulates production of testosterone by the Leydig
cells, whereas spermatogenesis depends on both FSH and
high intratesticular testosterone concentrations. LH and FSH
secretion is in turn under control of gonadotrophin releasing
hormone (GnRH) produced in the hypothalamus (Kaufman,
The neuroendocrine regulation of male reproduction, in:
Male Infertility. Clinical Investigation, Cause Evaluation
and Treatment, F. H. Comhaire, ed., Chapman and Hall,
London, pp 29-54, 1996). Since testicular products have
been shown to control LH and FSH production and, in turn,
these products regulate testicular function, this suggests a
regulatory role for zpep10 in hormone production by the
hypothalamic, pituitary, gonadal axis.

It is well known that steroidogenesis and spermatogenesis 
take place within two different cellular compartments of the 
testes, with Leydig and Sertoli cells responsible for the 
former and latter, respectively (Saez, Endocrin. Rev. 15: 
574-626, 1994). The activity of each of these cell types 
appears to be regulated by the secretory products of the 
other. Sertoli cell derived tumor necrosis factor-α, fibroblast
growth factor, interleukin-1, transforming growth factor-β,
epidermal growth factor/transforming growth factor-α,
activin, inhibin, insulin-like growth factor-1, platelet derived
growth factor, endothelin, and arginine-vasopressin have all
been shown to regulate Leydig cell function (Saez, Endocrin.
Rev. 15:574-626, 1994). Thus, zpep10 might control
or modulate the activities of one or more of these genes.

The membrane glycoprotein zpep10 may also function as
a binding site for one or more growth factor peptides or
hormones in much the same way that heparin binds with
platelet-derived growth factor (PDGF), fibroblast growth
factors (such as aFGF and bFGF) and vascular endothelial
growth factor (VEGF) and sequesters them on the cell
surface. 

In men, aging is associated with a progressive decline in 
testicular function. These changes are manifest clinically by 
decreased virility, vigor, and libido that point towards a
relative testicular deficiency (Vermeulen, Ann. Med.
25:531-4, 1993; Pugeat et al., Horm. Res. 43:104-10,
1995). Hormone replacement therapy in elderly men is not
currently recommended, which suggests that a new therapy
for the male climacterium would be very valuable. Zpep10
polypeptides, agonists or antagonists, either independently 
or in combination with other factors, may be evaluated 
therapeutically. 

Soluble Zpep10 polypeptides, zpep10 agonists and/or
Zpep10 antagonists may also have therapeutic value in
treatment of testicular cancer, infertility, or in the recovery 
of function following testicular surgery. 

The ability of zpep10 polypeptides and zpep10 agonists to
stimulate proliferation or differentiation of testicular cells
can be measured using cultured testicular cells or in vivo by
administering molecules of the present invention to the
appropriate animal model. Cultured testicular cells include
dolphin DB1.Tes cells (CRL-6258); mouse GC-1 spg cells
(CRL-2053); TM3 cells (CRL-1714); TM4 cells (CRL-1715);
and pig ST cells (CRL-1746), available from American
Type Culture Collection, 10801 University Boulevard,
Manassas, Va. Assays measuring cell proliferation or differentiation
are well known in the art. For example, assays
measuring proliferation include such assays as chemosensitivity
to neutral red dye (Cavanaugh et al., Investigational
New Drugs 8:347-354, 1990), incorporation of radiolabelled
nucleotides (Cook et al., Anal. Biochem. 179:1-7, 1989),
incorporation of 5-bromo-2'-deoxyuridine (BrdU) in the
DNA of proliferating cells (Porstmann et al., J. Immunol.
Methods 82:169-79, 1985), and use of tetrazolium salts
(Mosmann, J. Immunol. Methods 65:55-63, 1983; Alley et
al., Cancer Res. 48:589-601, 1988; Marshall et al., Growth
Reg. 5:69-84, 1995; and Scudiero et al., Cancer Res.
48:4827-33, 1988) and by measuring proliferation using
3H-thymidine uptake (Crowley et al., J. Immunol. Meth.
133:55-66, 1990). Assays measuring differentiation include,
for example, measuring cell-surface markers associated with
stage-specific expression of a tissue, enzymatic activity,
functional activity or morphological changes (Watt, FASEB J.
5:281-284, 1991; Francis, Differentiation 57:63-75, 1994;
Raits, Adv. Anim. Cell Biol. Technol. Bioprocesses, 161-71,
1989).

Zpep10 polypeptides, agonists and antagonists will also
prove useful in the study of spermatogenesis and infertility.
In vivo, Zpep10 agonists may find application in the treatment
of male infertility. Zpep10 antagonists may be useful
as male contraceptive agents. Zpep10 antagonists are useful
as research reagents for characterizing sites of ligand-receptor
interaction.

In vivo assays, well known in the art, are available for
evaluating the effect of zpep10 ligands and agonists on
testes. For example, compounds can be injected intraperitoneally
for a specific time duration. After the treatment
period, animals are sacrificed and testes removed and
weighed. Testicles are homogenized and sperm head counts
are made (Meistrich et al., Exp. Cell Res. 99:72-8, 1976).
Other activities, for example, chemotaxic activity that may
be associated with proteins of the present invention can be
analyzed. For example, late stage factors in spermatogenesis
are involved in egg-sperm interactions and sperm motility.
Activities, such as enhancing viability of cryopreserved
sperm, stimulating the acrosome reaction, enhancing sperm
motility and enhancing egg-sperm interactions may be associated
with the ligands and agonists of the present invention.
Assays evaluating such activities are known (Rosenberger,
J. Androl. 11:89-96, 1990; Fuchs, Zentralbl. Gynakol.
11:117-120, 1993; Neuwinger et al., Andrologia 22:335-9,
1990; Harris et al., Human Reprod. 3:856-60, 1988; and
Jockenhovel, Andrologia 22:171-178, 1990; Lessing et al.,
Fertil. Steril. 44:406-9, 1985; Zaneveld, In Male Infertility
Chapter 11, Comhaire Ed., Chapman & Hall, London 1996).
These activities are expected to result in enhanced fertility
and successful reproduction.

Localization of zpep10 to testis tissue suggests zpep10, its
agonists and/or antagonists may have applications in
enhancing fertilization during assisted reproduction in
humans and in animals. Such assisted reproduction methods
are known in the art and include artificial insemination, in
vitro fertilization, embryo transfer and gamete intrafallopian
transfer. Such methods are useful for assisting men and
women who may have physiological or metabolic disorders
that prevent natural conception. Such methods are also used
in animal breeding programs, such as for livestock, zoological
animals, endangered species or racehorses and could be
used as methods for the creation of transgenic animals.
To verify the presence of this capability in zpep10
polypeptides, agonists or antagonists of the present
invention, such molecules are evaluated with respect to their
ability to enhance viability of cryopreserved sperm, sperm
motility, the ability of sperm to penetrate cervical mucus,
particularly in association with methods of assisted
reproduction, according to procedures known in the art (see
for example, Juang et al., Anim. Reprod. Sci. 20:21-9, 1989;
Juang et al., Anim. Reprod. Sci. 22:47-53, 1990; Colon et al.,
Fertil. Steril. 46:1133-39, 1986; Lessing et al., Fertil. Steril.
44:406-9, 1985 and Brenner et al., Fertil. Steril. 42:92-6,
1984). If desired, zpep10 polypeptide performance in this
regard can be compared to relaxins and the like. In addition,
zpep10 polypeptides or agonists or antagonists thereof may
be evaluated in combination with one or more proteins to
identify synergistic effects. For example, soluble zpep10,
agonists and/or antagonists can be added to "capacitation
media", a cocktail of compounds known to activate sperm,
such as dibutyryl cyclic adenosine monophosphate
(dbcAMP) or theophylline. Such mixtures have resulted in
improved reproductive function of the sperm, in particular,
sperm motility and zonae penetration (Park et al., Am. J.
Obstet. Gynecol. 158:974-9, 1988; Vandevoort et al., Mol.
Repro. Develop. 37:299-304, 1993; Vandevoort and
Overstreet, J. Androl. 16:327-33, 1995). The capacitation
mixture can then be combined with sperm, an egg or an
egg-sperm mixture prior to fertilization of the egg.

In cases where pregnancy is not desired,
zpep10 polypeptides or polypeptide fragments may function
as germ-cell-specific antigens for use as components in
"immunocontraceptive" or "anti-fertility" vaccines to
induce formation of antibodies and/or cell mediated immunity
to selectively inhibit a process, or processes, critical to
successful reproduction in humans and animals. The use of
sperm and testis antigens in the development of an
immunocontraceptive has been described (O'Hern et al., Biol.
Reprod. 52:311-39, 1995; Diekman and Herr, Am. J.
Reprod. Immunol. 37:111-17, 1997; Zhu and Naz, Proc.
Natl. Acad. Sci. USA 94:4704-9, 1997). A vaccine based on
human chorionic gonadotrophin (HCG) linked to a diphtheria
or tetanus carrier is currently in clinical trials (Talwar et
al., Proc. Natl. Acad. Sci. USA 91:8532-36, 1994). A single
injection resulted in production of high titer antibodies that
persisted for nearly a year in rabbits (Stevens, Am. J.
Reprod. Immunol. 29:176-88, 1993). Such methods of
immunocontraception using vaccines would include a
Zpep10 testes-specific protein or fragment thereof. The
Zpep10 protein or fragments can be conjugated to a carrier
protein or peptide, such as tetanus or diphtheria toxoid. An
adjuvant, as described above, can be included and the
protein or fragment can be noncovalently associated with
other molecules to enhance intrinsic immunoreactivity.
Methods for administration and methods for determining the
number of administrations are known in the art. Such a
method might include a number of primary injections over
several weeks followed by booster injections as needed to
maintain a suitable antibody titer.

For pharmaceutical use, pharmaceutically effective
amounts of zpep10 therapeutic antibodies, small molecule
antagonists or agonists of zpep10 polypeptides, or zpep10
polypeptide fragments or soluble zpep10 receptors can be
formulated with pharmaceutically acceptable carriers for
parenteral, oral, nasal, rectal, topical, transdermal administration
or the like, according to conventional methods.
Formulations may further include one or more diluents,
fillers, emulsifiers, preservatives, buffers, excipients, and the
like, and may be provided in such forms as liquids, powders,
emulsions, suppositories, liposomes, transdermal patches
and tablets, for example. Slow or extended-release delivery
systems, including any of a number of biopolymers
(biological-based systems), systems employing liposomes,
and polymeric delivery systems, can also be utilized with the
compositions described herein to provide a continuous or
long-term source of the zpep10 polypeptide, agonist or
antagonist. Such slow release systems are applicable to
formulations, for example, for oral, topical and parenteral
use. The term "pharmaceutically acceptable carrier or
vehicle" refers to a carrier medium which does not interfere
with the effectiveness of the biological activity of the active
ingredients and which is not toxic to the host or patient. One
skilled in the art may formulate the compounds of the
present invention in an appropriate manner, and in accordance
with accepted practices, such as those disclosed in
Remington: The Science and Practice of Pharmacy,
Gennaro, ed., Mack Publishing Co., Easton, Pa., 19th ed.,
1995. A "pharmaceutically effective amount" of a
zpep10 polypeptide, agonist or antagonist, is an amount
sufficient to induce a desired biological result. The result can
be alleviation of the signs, symptoms, or causes of a disease,
or any other desired alteration of a biological system. For
example, an effective amount of a polypeptide of the present
invention is that which provides either subjective relief of
symptoms or an objectively identifiable improvement as
noted by the clinician or other qualified observer. Doses of
zpep10 polypeptide will generally be determined by the
clinician according to accepted standards, taking into
account the nature and severity of the condition to be treated,
patient traits, etc. Determination of dose is within the level
of ordinary skill in the art. The proteins may be administered
for acute treatment, over one week or less, often over a
period of one to three days or may be used in chronic
treatment, over several months or years.

Radiation hybrid mapping is a somatic cell genetic technique
developed for constructing high-resolution, contiguous
maps of mammalian chromosomes (Cox et al., Science
250:245-50, 1990). Partial or full knowledge of a gene's
sequence allows the designing of PCR primers suitable for
use with chromosomal radiation hybrid mapping panels.
Radiation hybrid mapping panels are commercially available
which cover the entire human genome, such as the
Stanford G3 RH Panel and the GeneBridge 4 RH Panel
(Research Genetics, Inc., Huntsville, Ala.). These panels
enable rapid, PCR based, chromosomal localizations and
ordering of genes, sequence-tagged sites (STSs), and other
nonpolymorphic and polymorphic markers within a region
of interest. This includes establishing directly proportional
physical distances between newly discovered genes of interest
and previously mapped markers. The precise knowledge
of a gene's position can be useful in a number of ways
including: 1) determining if a sequence is part of an existing
contig and obtaining additional surrounding genetic
sequences in various forms such as YAC-, BAC- or cDNA
clones, 2) providing a possible candidate gene for an inheritable
disease which shows linkage to the same chromosomal
region, and 3) for cross-referencing model organisms
such as mouse which may be beneficial in helping to
determine what function a particular gene might have.

The present invention provides reagents for use in diagnostic
applications. For example, the zpep10 gene, a probe
comprising zpep10 DNA or RNA, or a subsequence thereof
can be used to determine if the zpep10 gene is present on a
particular chromosome or if a mutation has occurred.
Detectable chromosomal aberrations at the zpep10 gene
locus include, but are not limited to, aneuploidy, gene copy
number changes, insertions, deletions, restriction site
changes and rearrangements. These aberrations can occur
within the coding sequence, within introns, or within flanking
sequences, including upstream promoter and regulatory
regions, and may be manifested as physical alterations
within a coding sequence or changes in gene expression
level.

In general, these diagnostic methods comprise the steps of
(a) obtaining a genetic sample from a patient; (b) incubating
the genetic sample with a polynucleotide probe or primer as
disclosed above, under conditions wherein the polynucleotide
will hybridize to complementary polynucleotide
sequence, to produce a first reaction product; and (c)
comparing the first reaction product to a control reaction
product. A difference between the first reaction product and
the control reaction product is indicative of a genetic abnormality
in the patient. Genetic samples for use within the
present invention include genomic DNA, cDNA, and RNA.
The polynucleotide probe or primer can be RNA or DNA,
and will comprise a portion of SEQ ID NO:1, the complement
of SEQ ID NO:1, or an RNA equivalent thereof.
Suitable assay methods in this regard include molecular 
genetic techniques known to those in the art, such as 
restriction fragment length polymorphism (RFLP) analysis, 
short tandem repeat (STR) analysis employing PCR
techniques, ligation chain reaction (Barany, PCR Methods
and Applications 1:5-16, 1991), ribonuclease protection
assays, and other genetic linkage analysis techniques known
in the art (Sambrook et al., ibid.; Ausubel et al., ibid.;
Marian, Chest 108:255-65, 1995). Ribonuclease protection
assays (see, e.g., Ausubel et al., ibid., ch. 4) comprise the
hybridization of an RNA probe to a patient RNA sample,
after which the reaction product (RNA-RNA hybrid) is
exposed to RNase. Hybridized regions of the RNA are
protected from digestion. Within PCR assays, a patient's 
genetic sample is incubated with a pair of polynucleotide 
primers, and the region between the primers is amplified and 
recovered. Changes in size or amount of recovered product 
are indicative of mutations in the patient. Another PCR- 
based technique that can be employed is single strand 
conformational polymorphism (SSCP) analysis (Hayashi,
PCR Methods and Applications 1:34-8, 1991). 

Polynucleotides encoding zpep10 polypeptides are useful
within gene therapy applications where it is desired to
increase or inhibit zpep10 activity. If a mammal has a
mutated or absent zpep10 gene, the zpep10 gene can be
introduced into the cells of the mammal. In one
embodiment, a gene encoding a zpep10 polypeptide is
introduced in vivo in a viral vector. Such vectors include an
attenuated or defective DNA virus, such as, but not limited
to, herpes simplex virus (HSV), papillomavirus, Epstein
Barr virus (EBV), adenovirus, adeno-associated virus
(AAV), and the like. Defective viruses, which entirely or
almost entirely lack viral genes, are preferred. A defective
virus is not infective after introduction into a cell. Use of
defective viral vectors allows for administration to cells in a
specific, localized area, without concern that the vector can
infect other cells. Examples of particular vectors include, but
are not limited to, a defective herpes simplex virus 1 (HSV1)
vector (Kaplitt et al., Molec. Cell. Neurosci. 2:320-30,
1991); an attenuated adenovirus vector, such as the vector
described by Stratford-Perricaudet et al., J. Clin. Invest.
90:626-30, 1992; and a defective adeno-associated virus
vector (Samulski et al., J. Virol. 61:3096-101, 1987; Samulski
et al., J. Virol. 63:3822-8, 1989).

In another embodiment, a zpep10 gene can be introduced
in a retroviral vector, e.g., as described in Anderson et al.,
U.S. Pat. No. 5,399,346; Mann et al., Cell 33:153, 1983;
Temin et al., U.S. Pat. No. 4,650,764; Temin et al., U.S. Pat.
No. 4,980,289; Markowitz et al., J. Virol. 62:1120, 1988;
Temin et al., U.S. Pat. No. 5,124,263; International Patent
Publication No. WO 95/07358, published Mar. 16, 1995 by
Dougherty et al.; and Kuo et al., Blood 82:845, 1993.
Alternatively, the vector can be introduced by lipofection in
vivo using liposomes. Synthetic cationic lipids can be used
to prepare liposomes for in vivo transfection of a gene
encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. USA
84:7413-7, 1987; Mackey et al., Proc. Natl. Acad. Sci. USA
85:8027-31, 1988). The use of lipofection to introduce
exogenous genes into specific organs in vivo has certain
practical advantages. Molecular targeting of liposomes to
specific cells represents one area of benefit. More
particularly, directing transfection to particular cells represents
one area of benefit. For instance, directing transfection
to particular cell types would be particularly advantageous
in a tissue with cellular heterogeneity, such as the pancreas,
liver, kidney, and brain. Lipids may be chemically coupled
to other molecules for the purpose of targeting. Targeted
peptides (e.g., hormones or neurotransmitters), proteins such
as antibodies, or non-peptide molecules can be coupled to
liposomes chemically.

It is possible to remove the target cells from the body; to 
introduce the vector as a naked DNA plasmid; and then to 
re-implant the transformed cells into the body. Naked DNA 
vectors for gene therapy can be introduced into the desired
host cells by methods known in the art, e.g., transfection,
electroporation, microinjection, transduction, cell fusion,
DEAE dextran, calcium phosphate precipitation, use of a
gene gun or use of a DNA vector transporter. See, e.g., Wu
et al., J. Biol. Chem. 267:963-7, 1992; Wu et al., J. Biol.
Chem. 263:14621-4, 1988.

Antisense methodology can be used to inhibit zpep10
gene translation, such as to inhibit cell proliferation in vivo.
Polynucleotides that are complementary to a segment of a
zpep10-encoding polynucleotide (e.g., a polynucleotide as
set forth in SEQ ID NO:1) are designed to bind to zpep10-encoding
mRNA and to inhibit translation of such mRNA.
Such antisense polynucleotides are used to inhibit expression
of Zpep10 polypeptide-encoding genes in cell culture or
in a subject.
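
As a minimal illustration of the complementarity described above, the following Python sketch derives an antisense oligonucleotide for a short stretch of the coding sequence of SEQ ID NO:1, beginning at the initiation codon (nucleotides 79-93). The function name `antisense_oligo` and the choice of a 15-nucleotide target are assumptions made purely for illustration; the specification does not prescribe an oligonucleotide length or a target site.

```python
# Illustrative sketch only: antisense design is reduced here to taking the
# reverse complement of a sense-strand target segment. Practical designs
# weigh many further factors that are not modelled in this example.

COMPLEMENT = str.maketrans("acgt", "tgca")


def antisense_oligo(target: str) -> str:
    """Return the DNA antisense strand, written 5'->3', for a sense-strand target."""
    return target.lower().translate(COMPLEMENT)[::-1]


# Nucleotides 79-93 of SEQ ID NO:1 (the first five codons of the coding sequence).
target_segment = "atgtggaggctggca"
print(antisense_oligo(target_segment))
```

Applying the function twice returns the original segment, which is a quick sanity check that the oligonucleotide is a true reverse complement.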

Transgenic mice, engineered to express the zpep10 gene,
and mice that exhibit a complete absence of zpep10 gene
function, referred to as "knockout mice" (Snouwaert et al.,
Science 257:1083, 1992), may also be generated (Lowell et
al., Nature 366:740-42, 1993). These mice may be
employed to study the zpep10 gene and the protein encoded
thereby in an in vivo system.

The invention is further illustrated by the following 
non-limiting examples. 

EXAMPLES

Example 1

Identification of Zpep10

The Zpep10 polypeptide-encoding polynucleotides of the
present invention were initially identified by querying an
EST database for polypeptides containing repetitive patterns
and post-translational processing sites yielding potentially
active peptides. The polypeptide encoded by an EST meeting
those search criteria was further analyzed and found to
be a membrane glycoprotein. The EST sequence was from
a testis cell library. Several clones considered likely to
contain the entire coding region were used for sequencing
and resulted in incompletely spliced messages. A minimal
nucleotide sequence having all potential introns spliced out
was generated from these sequences.

To obtain the complete cDNA sequence a human testis
library was screened. The library was plated in pools of
12,000. Plasmid DNA was prepared from the plated bacteria
using a Qiagen® plasmid purification column (Qiagen, Inc.,
Chatsworth, Calif.) according to the manufacturer's instructions.
DNA from these pools was used as template DNA to
identify pools containing the DNA encoding zpep10 using
PCR. Oligonucleotide primers ZC16,186 (SEQ ID NO:9)
and ZC16,187 (SEQ ID NO:10) were designed from the
sequence of the EST. One nanogram of template DNA was
combined with 20 pmoles of each primer in a PCR mixture.
The reaction mixture was incubated at 94° C. for 5 minutes,
then run for 35 cycles of 94° C., 30 seconds and 68° C., 30
seconds; followed by an extension at 68° C. for 7 minutes.
Pools having the correct sized PCR product, 290 bp, were
used as a template for PCR isolation of the 5' end of the
clones. Sequence specific primer ZC16,186 (SEQ ID NO:9)
and vector specific primer ZC13,006 (SEQ ID NO:11) were
used in PCR reactions as above. PCR products were purified
by Qiaex II Gel Extraction Kit (Qiagen, Inc.) according to
manufacturer's instructions and sequenced. Pools which
contained the clones with the most fully spliced sequence
were used to transform E. coli and plated to agar. The
colonies were transferred to nitrocellulose and probed with
a 290 bp fragment derived above. The probe was radioactively
labeled using the MULTIPRIME DNA labeling kit
(Amersham, Arlington Heights, Ill.) according to the
manufacturer's instructions. The probe was purified using a
NUCTRAP push column (Stratagene). ExpressHyb
(Clontech) solution was used for prehybridization and as a
hybridizing solution for the colony lifts. Hybridization took
place at 65° C. for over 12 hours using 1.2x10^6 cpm/ml of
labeled probe. The filters were then washed 4 times at 5
minutes each in 2x SSC, 0.005% SDS at 25° C. followed by
2 washes at 20 minutes each in 0.1x SSC, 0.1% SDS at 50°
C. with continuous agitation. Plasmid DNA from those
colonies producing a signal was isolated and sequenced.
The 1094 bp (SEQ ID NO:1) sequence encoding the full
length zpep10 polypeptide was isolated. An intron may be
contained within the 3' untranslated region from base pairs
560-784 of SEQ ID NO:1.

Example 2

Tissue Distribution

Northern blots ([...] MTN III; Clontech) were probed to
determine the tissue distribution of human zpep10 expression.
An approximately [...] probe, entirely 3' UTR, was derived
by restriction digest of the clone described above with
Not I and Eco RI. The fragment was isolated by
gel electrophoresis and purified using Qiaex II (Qiagen,
Chatsworth, Calif.) according to manufacturer's instructions.
The probe was radioactively labeled using the MULTIPRIME
DNA labeling kit (Amersham, Arlington Heights,
Ill.) according to the manufacturer's instructions. The probe
was purified using a NUCTRAP push column (Stratagene).
EXPRESSHYB (Clontech) solution was used for prehybridization
and as a hybridizing solution for the Northern blots.
Hybridization took place overnight at 65° C., and the
blots were then washed at [...]0° C. in 1x SSC, 0.1% SDS. A
1.5 kb transcript corresponding to Zpep10 was seen in testis
and a non-discrete smear was seen in kidney.

A RNA Master Dot Blot (Clontech) that contained RNAs
from various tissues that were normalized to 8 housekeeping
genes was also probed and hybridized as described above.
The highest level of expression was seen in testis with
significantly reduced expression in kidney.
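
The 290 bp product size reported in Example 1 for primers ZC16,186 (SEQ ID NO:9) and ZC16,187 (SEQ ID NO:10) can be checked against SEQ ID NO:1 with a short Python sketch. The helper names `reverse_complement` and `amplicon_length` are hypothetical, the search is exact-match only, and the template below is nucleotides 605-964 of SEQ ID NO:1 transcribed with apparent scanning errors corrected; it is an illustration under those assumptions rather than a primer-design tool.

```python
# Illustrative sketch: locate a primer pair on a template by exact matching
# and report the expected PCR product size. No mismatch tolerance is modelled.

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lower- or upper-case DNA string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, sense_primer: str, antisense_primer: str) -> int:
    """Length of the product spanning both primer sites on the template."""
    start = template.find(sense_primer)
    site = reverse_complement(antisense_primer)
    end = template.find(site)
    if start < 0 or end < 0:
        raise ValueError("primer site not found on template")
    return end + len(site) - start


# Nucleotides 605-964 of SEQ ID NO:1 (3' untranslated region).
TEMPLATE = (
    "ccaaagacagtctccagaccctaaaacccagacatccctgcttctggttggtgagataat"
    "gaaaaacaagaaaatccccaaaaacccagatcccccacaatcccagtgtcagatggcctc"
    "ccgggaacccaggcacccacagctggaaagttcctcccctccagccctcaaccaatcaca"
    "tggctgtcaacaatgccaggaaaatatctacagaaggaaagaatcccctacgccactccc"
    "accacacccacacccccttctgcctgttccgggaaagcgggggcatctgccccagaagct"
    "attccaggccctcctatgactgatggggaatccgggaatgcatgttctggaaaactcacc"
)
ZC16_187 = "tccctgcttctggttggtgagataa"  # SEQ ID NO:10, sense orientation
ZC16_186 = "atcagtcataggagggcctggaata"  # SEQ ID NO:9, antisense orientation

print(amplicon_length(TEMPLATE, ZC16_187, ZC16_186))
```

On this template the two primer sites span nucleotides 639 to 928 of SEQ ID NO:1, which corresponds to the 290 bp product described in the text.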



SEQUENCE LISTING 



<160> NUMBER OF SEQ ID NOS: 14 

<210> SEQ ID NO 1 

<211> LENGTH: 1094

<212> TYPE: DNA 

<213> ORGANISM: Homo sapiens 

<220> FEATURE: 

<221> NAME/KEY: CDS 

<222> LOCATION: (79)...(504)

<400> SEQUENCE: 1 

ggcaagggct ggagccaggg ctgcagagca ttccttggct cagctggggc agcgccgccc 60 

catcccccag tggtcctc atg tgg agg ctg gca cta ggc ggg gtt ttc ctg 111
Met Trp Arg Leu Ala Leu Gly Gly Val Phe Leu
1 5 10

gca gcc gcc cag gct tgt gtc ttc tgt cgc ctc cca gcc cac gac ttg 159
Ala Ala Ala Gln Ala Cys Val Phe Cys Arg Leu Pro Ala His Asp Leu
15 20 25

tca ggc cgc ctg gct cgg ctc tgc agc cag atg gag gcc agg cag aag 207
Ser Gly Arg Leu Ala Arg Leu Cys Ser Gln Met Glu Ala Arg Gln Lys
30 35 40

gaa tgt ggg gcc tcc cca gac ttc tcg gcc ttt gcc tta gat gag gtg 255
Glu Cys Gly Ala Ser Pro Asp Phe Ser Ala Phe Ala Leu Asp Glu Val
45 50 55

tcc atg aac aaa gtc aca gag aag act cac aga gtc ctg agg gtc atg 303
Ser Met Asn Lys Val Thr Glu Lys Thr His Arg Val Leu Arg Val Met
60 65 70 75

ggg ggc agc acc acg ctg tac aac tgc tcc acc tgc aag ggg acg gag 351
Gly Gly Ser Thr Thr Leu Tyr Asn Cys Ser Thr Cys Lys Gly Thr Glu
80 85 90

gtg tcc tgc tgg ccc cga aag cgc tgc ttc cca gga agt cag gat ctt 399
Val Ser Cys Trp Pro Arg Lys Arg Cys Phe Pro Gly Ser Gln Asp Leu
95 100 105

tgg gaa gcc aag att ctg ctc ctc tcc atc ttc gga gct ttc ctg ctt 447
Trp Glu Ala Lys Ile Leu Leu Leu Ser Ile Phe Gly Ala Phe Leu Leu
110 115 120

ctg ggt gtt ctg agc ctc ctg gtg gag tcc cac cac ctc caa gca aaa 495
Leu Gly Val Leu Ser Leu Leu Val Glu Ser His His Leu Gln Ala Lys
125 130 135

agt ggc ttg tgaagacgct gaaaacctcc cagcctccag ctctaagggg 544
Ser Gly Leu
140

tatgcactca caacttccac atcccttgga ggggaaccag tcagcccctt agtcccagct 604 

ccaaagacag tctccagacc ctaaaaccca gacatccctg cttctggttg gtgagataat 664 

gaaaaacaag aaaatcccca aaaacccaga tcccccacaa tcccagtgtc agatggcctc 724 

ccgggaaccc aggcacccac agctggaaag ttcctcccct ccagccctca accaatcaca 784

tggctgtcaa caatgccagg aaaatatcta cagaaggaaa gaatccccta cgccactccc 844 

accacaccca cacccccttc tgcctgttcc gggaaagcgg gggcatctgc cccagaagct 904 

attccaggcc ctcctatgac tgatggggaa tccgggaatg catgttctgg aaaactcacc 964 

ccactagagt gagatcacat cagtgggttc gcgggcatgc cctccctcca tcgtgttaac 1024 

agtttgaaat cctggcctcc ctcagaggcc tccatcctgc caggcctaag taaaacttgc 1084 

tgttcatgga 1094 



<210> SEQ ID NO 2 

<211> LENGTH: 142 

<212> TYPE: PRT 

<213> ORGANISM: Homo sapiens 

<400> SEQUENCE: 2 

Met Trp Arg Leu Ala Leu Gly Gly Val Phe Leu Ala Ala Ala Gln Ala
1 5 10 15

Cys Val Phe Cys Arg Leu Pro Ala His Asp Leu Ser Gly Arg Leu Ala
20 25 30

Arg Leu Cys Ser Gln Met Glu Ala Arg Gln Lys Glu Cys Gly Ala Ser
35 40 45

Pro Asp Phe Ser Ala Phe Ala Leu Asp Glu Val Ser Met Asn Lys Val
50 55 60

Thr Glu Lys Thr His Arg Val Leu Arg Val Met Gly Gly Ser Thr Thr
65 70 75 80

Leu Tyr Asn Cys Ser Thr Cys Lys Gly Thr Glu Val Ser Cys Trp Pro
85 90 95

Arg Lys Arg Cys Phe Pro Gly Ser Gln Asp Leu Trp Glu Ala Lys Ile
100 105 110

Leu Leu Leu Ser Ile Phe Gly Ala Phe Leu Leu Leu Gly Val Leu Ser
115 120 125

Leu Leu Val Glu Ser His His Leu Gln Ala Lys Ser Gly Leu
130 135 140
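
The relationship between the coding region of SEQ ID NO:1 (nucleotides 79-504, i.e. 426 nucleotides or 142 codons) and the 142-residue polypeptide of SEQ ID NO:2 can be illustrated with a small translation sketch in Python. The function name `translate` is an assumption for illustration, the standard genetic code is assumed, and only the first eleven codons and the last three codons plus the stop codon are used as input.

```python
from itertools import product

# Standard genetic code, built in t/c/a/g order for each codon position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence (one-letter codes) up to the first stop codon."""
    cds = cds.lower()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides 79-111 of SEQ ID NO:1: Met Trp Arg Leu Ala Leu Gly Gly Val Phe Leu.
print(translate("atgtggaggctggcactaggcggggttttcctg"))
# Nucleotides 496-510 of SEQ ID NO:1: Ser Gly Leu followed by the tga stop codon.
print(translate("agtggcttgtgaaga"))
```

Translating the full 426-nucleotide coding region in the same way would be expected to give the complete sequence of SEQ ID NO:2.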



<210> SEQ ID NO 3 
<211> LENGTH: 426 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE:

<223> OTHER INFORMATION: Degenerate nucleotide sequence encoding the

zpep10 polypeptide of SEQ ID NO: 2
<221> NAME/KEY: variation
<222> LOCATION: (1)...(426)

<223> OTHER INFORMATION: Each N is independently any nucleotide. 



<400> SEQUENCE: 3 



atgtggmgny tngcnytngg nggngtntty ytngcngcng cncargcntg ygtnttytgy 60

mgnytnccng cncaygayyt nwsnggnmgn ytngcnmgny tntgywsnca ratggargcn 120

mgncaraarg artgyggngc nwsnccngay ttywsngcnt tygcnytnga ygargtnwsn 180

atgaayaarg tnacngaraa racncaymgn gtnytnmgng tnatgggngg nwsnacnacn 240

ytntayaayt gywsnacntg yaarggnacn gargtnwsnt gytggccnmg naarmgntgy 300

ttyccnggnw sncargayyt ntgggargcn aarathytny tnytnwsnat httyggngcn 360

ttyytnytny tnggngtnyt nwsnytnytn gtngarwsnc aycayytnca rgcnaarwsn 420

ggnytn 426
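
The degenerate sequence of SEQ ID NO:3 follows from SEQ ID NO:2 by replacing each amino acid with a single degenerate codon written in IUPAC ambiguity codes. The Python sketch below reproduces that mapping; the dictionary `DEGENERATE_CODONS` lists the codons as they appear in SEQ ID NO:3, while the function name `back_translate` is an assumption for illustration. Note that some of these codons (for example ytn, mgn and wsn) are over-inclusive and also match triplets encoding other amino acids.

```python
# Degenerate codons (IUPAC ambiguity codes) as they appear in SEQ ID NO:3.
DEGENERATE_CODONS = {
    "A": "gcn", "C": "tgy", "D": "gay", "E": "gar", "F": "tty",
    "G": "ggn", "H": "cay", "I": "ath", "K": "aar", "L": "ytn",
    "M": "atg", "N": "aay", "P": "ccn", "Q": "car", "R": "mgn",
    "S": "wsn", "T": "acn", "V": "gtn", "W": "tgg", "Y": "tay",
}


def back_translate(protein: str) -> str:
    """Map a one-letter protein sequence to a degenerate nucleotide sequence."""
    return "".join(DEGENERATE_CODONS[residue] for residue in protein.upper())


# First 20 residues of SEQ ID NO:2; the result corresponds to the first
# 60 nucleotides of SEQ ID NO:3.
print(back_translate("MWRLALGGVFLAAAQACVFC"))
```

The 17-nucleotide degenerate probes of SEQ ID NOS:4-8 correspond to windows of the same degenerate sequence.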



<210> SEQ ID NO 4 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Degenerate oligonucleotide probe 
<221> NAME/KEY: variation
<222> LOCATION: (1)...(17) 

<223> OTHER INFORMATION: Each N is independently any nucleotide 
<400> SEQUENCE: 4 

cargcntgyg tnttytg 17 



<210> SEQ ID NO 5 

<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Degenerate oligonucleotide probe 

<221> NAME/KEY: variation 
<222> LOCATION: (1)...(17) 

<223> OTHER INFORMATION: Each N is independently any nucleotide. 
<400> SEQUENCE: 5 

caraargart gyggngc 17 



<210> SEQ ID NO 6 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Degenerate oligonucleotide probe 
<221> NAME/KEY: variation 
<222> LOCATION: (1)...(17)

<223> OTHER INFORMATION: Each N is independently any nucleotide. 

<400> SEQUENCE: 6 

atgaayaarg tnacnga 17



<210> SEQ ID NO 7 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Degenerate oligonucleotide probe 
<221> NAME/KEY: variation 
<222> LOCATION: (1)...(17)
<223> OTHER INFORMATION: Each N is independently any nucleotide.
<400> SEQUENCE: 7

gtnacngara aracnca 17

<210> SEQ ID NO 8 
<211> LENGTH: 17 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Degenerate oligonucleotide probe 

<221> NAME/KEY: variation
<222> LOCATION: (1)...(17)

<223> OTHER INFORMATION: Each N is independently any nucleotide.
<400> SEQUENCE: 8

acntgyaarg gnacnga 17 



<210> SEQ ID NO 9 
<211> LENGTH: 25 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Oligonucleotide ZC16,186 
<400> SEQUENCE: 9 

atcagtcata ggagggcctg gaata 25 



<210> SEQ ID NO 10 

<211> LENGTH: 25 

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Oligonucleotide ZC16,187 

<400> SEQUENCE: 10 

tccctgcttc tggttggtga gataa 25 



<210> SEQ ID NO 11 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Oligonucleotide ZC 13,006 
<400> SEQUENCE: 11 

ggctgtcctc taagcgtcac 20 



<210> SEQ ID NO 12 
<211> LENGTH: 12 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Nucleotide contig example 
<400> SEQUENCE: 12 

atggcttagc tt 12 



<210> SEQ ID NO 13 
<211> LENGTH: 12
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Nucleotide contig example



<400> SEQUENCE: 13 

tagcttgagt ct 12

<210> SEQ ID NO 14 

<211> LENGTH: 12 

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Nucleotide contig example 

<400> SEQUENCE: 14 

gtcgactacc ga 12
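
As a hedged illustration of how the contig examples relate, the sketch below joins two sequences that share a suffix/prefix overlap. The function name and the minimum overlap of three nucleotides are assumptions made for the example, not part of the listing. Applied to SEQ ID NO: 12 and SEQ ID NO: 13, which share the six nucleotides tagctt, it yields an 18-nucleotide contig.

```python
def merge_overlap(left, right, min_overlap=3):
    """Join two sequences on the longest suffix/prefix overlap.

    Returns None when no overlap of at least min_overlap exists.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


seq12 = "atggcttagctt"  # SEQ ID NO: 12
seq13 = "tagcttgagtct"  # SEQ ID NO: 13
print(merge_overlap(seq12, seq13))
```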



What is claimed is: 

1. An isolated polypeptide comprising an extracellular 
domain, wherein said extracellular domain comprises amino 
acid residues 22 to 111 of the amino acid sequence of SEQ 
ID NO:2.

2. An isolated polypeptide according to claim 1, wherein 
said polypeptide further comprises a transmembrane domain 
that resides in a carboxyl-terminal position relative to said 
extracellular domain, wherein said transmembrane domain 
comprises amino acid residues 112 to 133 of the amino acid 
sequence of SEQ ID NO:2.

3. An isolated polypeptide according to claim 2, wherein 
said polypeptide further comprises a cytoplasmic domain 
that resides in a carboxyl-terminal position relative to said 
transmembrane domain, wherein said cytoplasmic domain 
comprises amino acid residues 134 to 142 of the amino acid 
sequence of SEQ ID NO:2.

4. An isolated polypeptide according to claim 2, wherein 
said polypeptide further comprises a secretory signal that 
resides in an amino-terminal position relative to said extra- 
cellular domain, wherein said secretory signal sequence 
comprises amino acid residues 1 to 20 of the amino acid 
sequence of SEQ ID NO: 2. 

5. An isolated polypeptide according to claim 1 compris- 
ing amino acid residue 1 to amino acid residue 142 of SEQ 
ID NO:2. 

6. An isolated polypeptide according to claim 1, 
covalently linked amino terminally or carboxy terminally to 
a moiety selected from the group consisting of affinity tags,
toxins, radionucleotides, enzymes and fluorophores. 

7. An isolated polypeptide comprising a sequence of 
amino acid residues that is at least 80% identical to amino
acid residue 21 to amino acid residue 142 of SEQ ID NO:2,
wherein said polypeptide specifically binds with an antibody
that specifically binds with a polypeptide having the amino
acid sequence of SEQ ID NO:2.

8. An isolated polypeptide according to claim 7, wherein 
any difference between said amino acid sequence of said 
isolated polypeptide and said corresponding amino acid 
sequence of SEQ ID NO: 2 is due to a conservative amino 
acid substitution. 

9. An isolated polypeptide of claim 7, wherein the amino 
acid percent identity is determined using a FASTA program 
with ktup=1, gap opening penalty=10, gap extension
penalty=1, and substitution matrix=blosum62, with other
parameters set as default. 

10. An isolated polypeptide comprising the amino acid 
sequence of amino acid residue 1 to amino acid residue 20 
of SEQ ID NO:2.

11. An isolated polypeptide selected from the group 
consisting of: 

a) amino acid residues 21-111 of SEQ ID NO:2;

b) amino acid residues 112-133 of SEQ ID NO:2;

c) amino acid residues 134-142 of SEQ ID NO:2;

d) amino acid residues 1-20 of SEQ ID NO:2;

e) amino acid residues 21-133 of SEQ ID NO:2;

f) amino acid residues 112-142 of SEQ ID NO:2;

g) amino acid residues 1-111 of SEQ ID NO:2; and

h) amino acid residues 1-133 of SEQ ID NO:2.
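
The residue ranges recited in items a) to d) can be checked with a short sketch. The domain labels are taken from claims 1 to 4; the helper names and the placeholder sequence are hypothetical. Note that claim 1 recites residues 22 to 111 for the extracellular domain, whereas item a) recites 21-111; the sketch uses the ranges of items a) to d), which together cover all 142 residues.

```python
# 1-based, inclusive residue ranges as recited in items a) to d).
DOMAINS = {
    "secretory signal": (1, 20),
    "extracellular": (21, 111),
    "transmembrane": (112, 133),
    "cytoplasmic": (134, 142),
}


def domain_length(start, end):
    """Length of a 1-based inclusive residue range."""
    return end - start + 1


def slice_domain(sequence, start, end):
    """Extract a 1-based inclusive residue range from a sequence string."""
    return sequence[start - 1:end]


total = sum(domain_length(*bounds) for bounds in DOMAINS.values())
print(total)

# Placeholder sequence of the right length, used only to show the slicing.
placeholder = "X" * 142
print(len(slice_domain(placeholder, *DOMAINS["transmembrane"])))
```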

12. A fusion protein consisting of a first portion and a 
second portion joined by a peptide bond, said first portion 
comprising a polypeptide according to claim 1, and 

said second portion comprising another polypeptide. 

13. A polypeptide according to claim 1 in combination 
with a pharmaceutically acceptable vehicle. 

* * * * * 
Fig. 1 (1)

I tacaatgqgq tggc<i9aggt gaag aaacgg ggttacttct atgctagaac gcaaggaaca 13ftf> 
yng vae vkkr gyf yar t 
yn^ vae vnte rga ngct 1 

61 taaaaaaatg Utaaaagcg gtaaaaattg ggcagtcgtt acactctcga ctgctgcgct 
1 uyksgknwavvtlstaa 

121 ggtacttggt gcaacaactg taaatgcatc egcggacaca aacattgaaa acaatgattc 
iBlvfg att vna sadt nie nnd 

IBl tcctactgta caagctacaa caggtgataa tgatattgct gttaaaagtg tgacacttgg 
assstv qvt tgd ndia vks vtl 

241 tagtggtcaa gttagtgcag ctagtgatac gactattaga acttctgcta atgcaaacag 
SSgagq vsa asd tcir tsa nan 

301 tgcttcttet gcegctaaca cacaaaattc uaeagtcaa gtageaagtt ctgctgcaat 
7B6aB8 aan tqn snsg vas saa 

3(1 aacatcacct acaagttccg cagcttcatt aaataacaca gaugtaaag cggctcaaga 
98itS8 tss aaa Innt dsk aag 

421 aaacactaat acagocaaaa atgatgacac geaaaaagct gcaccagcta aogaaccttc 
llBentn tak ndd tqka apa nes 

481 tgaagctaaa aatgaaccag ctgtaaacgt taatgattct ccagctgcaa aaaatgatga 
ueseak nep avn vads saa knd 

541 tcaacaakcc agtaaaaaga atactaccgc caagttaaac aaggatgctg aaaacgttgt 
ISSdqqs akk ntt akin kda env 

(01 aaaaaaggcg ggaattgatc etaacagttt aactgatgac cagattaaag cattaaacaa 
178vkka gid pus ltdd qik aln 
Fig. 1 (2)

661 gatgaaettc tcgaaagctg caaagtctgg tacawaatg actutaatg atttccaaaa 
198 k n a £ ska a k a g t q m t y n d f q 

721 gattgctgat T,ttaatai aaeaaaatgg tcggt acaca gttccattct ttaaagcaag 20f tfi < 
218k i a d t 1 i k q d g r y t v p £ f k a 

181 tgaaatcaaa aatatgcctg ccgctacaac taaagatgca caaactaata ctattgaacc 
238 8 a i k nop a a t t k d a q t n tie 

841 ttugatgta tgggattcat ggccagttca agatgttcgg acaggacaag ttgctaattg 5f tf > 
258 p 1 d y « d 8 w p V q d v r t g q van ftftfi < 

901 gaatggcut caacttgtca tcgcaatgat gggaattcca aaccaaaatg ataatcatat 
278wngy qlv ian agip nqn dab 

961 ctatctctca cataataagt atggtgataa tgaatbaagt cattgga aga atgtaggtce 7£c£ > 
298 iyll yak ygd nels hvk nvg 

1021 aatttttgge tataattcta cogcggttte aoaarotm tcamatcaa ctgttttaa a 7ft£ > 
318pl£g y&8 Cav sqev sgs avl $ft£i < 



1081 cagtgataac tctatccaat tattttatac aagggtagac acgtctgata acaataccaa 
338 a8dQ 8iq l£y trvd tsd nnc 



1141 tcatcaaaaa at tgctagcg ctactcttca ttta actgat aataatggaa atgtateac t Nhel 
358 nliqk ias atl yltd nog nvs ACl(i)c> 

1201 cgctcaggta egaaatgaet atattgtatt tgaaggtgat q getattaet accaaactta AC2(i}<> 
378 laqv rod y'iv fegd gyy yqt 



1261 tgatoaatg g aaagctacta acaaaggtgc cgataatabt gcaatgcgtg atgctcatgt 
398 ydqv kat nkg ad&i ami dah 
Fig. 1 (3)

1321 aattgaagac ggtaatggtg atcggtacct tgtttttgaa gcaagtactg gtttggaaaa 
418 vied gng dry Ivfe ast gle 

1381 ctatcaaggc gaggaccaaa tttataactg gttaaattat ggcggagatg acgcatttaa 
43etiyqg e.-dq lyn vlny ggd daf 

1441 taccaagagc ttatttagaa ttctttccaa tgatgatatt aagagtcggg caacttgggc 
4S8nlka Ifr lis n.ddi ksr atw 

1501 taatgcagct atcggtatcc tcaaactaaa taaggacgaa aagaatccta aggtggcaga 
478 aaaa igi Ikl nkde kap kva 

1561 gctacactca ccattaacct ctgcaccaat ggtaagcgac gaaattgagc gaccaaatgt 
49aelyfi pli sap mvad eie rpn 

1621 agttaaatta ggtaataaat attacttact tgccgctacc cgtttaaatc gaggaagtaa 
SlSvvkl gnk yyl faat rln rgs 

1681 tgatgatgct tggatgaatg ctaattatgc ogtcggtgat aatgttgcaa t^tcggata 
538 n dda van any avgd ava nvg 

1741 tgttgctgat agtctaacbg gatccutaa gccattaaat gattctggag tagtcttgac 
SSayvad sit gay kpln dag vvl 

1801 tgcttctgtt cctgcaaact gjoggacagc aacttattca tattatgctg tccccgttgc 
578 ta8v pan vrC atys yya vpv 

1861 cggaaaagat gaccaagtat tagttacttc atatatgacc aatagaaatg gagtagcggg 
SSBagkd dqv Ivt syot nrn gva 



1921 taaaggaatg ga tteaactt gggcaccgag tttctta cta caaattaacc cggacaacac 12ftf i < 
6l8gkgm dst vap sfll qin pdn 
Fig. 1 (4)



1981 aactaccgtt ttagctaaaa tgactaatca aggggattgg atttgggatg acccaagcga 
fiascttv lak mta qgclw iwd dss 

2041 aaaccttgat atgattggtg atCtagactc cgctgcttta cctggcgaac gtgataaacc 
essenld nig did saal pge rdk 

2101 tgttgactgg gacitaattg gttatggatt aaaaccgcat gatcctgcta caccaaatga 
678 pvdw dli gyg Ik ph dpa tpn 

21(1 tcctgaaacg ccaactacac cagaaacccc tgagacacct aatactccca aaacaccaaa 
638 dpet ptt pet petp ntp kbp 

2221 gactcctgaa aatcctggga cacctcaaac tcctaataca cctaacaccc cggaaattcc 
716 ktpe npq tpq tpnt pnt pel 

2281 tbtaactcca gaaacgccta agcaacctga aacccaaact aataatcgtc tgccacaaac 
738 plcp etp kqp et qt nnr Ipq 

2341 tggaaataat gccaataaag ocatgattgg cctaggcatg ggaacattgc tCagtatgtt 
7S8tgnn aiik anl glgm qtl Isn 

2401 tggtcttgca gaaattaaca aaGgtcgatt taactaaata ctttaaaata aaaccgctaa 
778 fgla ein krr fn 

2461 gccttaaatt cagcttaacg gctttttatt ttaaaagttt ttattgtaaa aaagcgaatt 
2521 atcattaata ctaatgcaat cgttgtaaga ccttacgaca gtagtaacaa tgaatttgcc 
2561 catctttgCc gg 
Fig. 3

N-terminal sequence of FTFB (levansucrase):
(A)QVESNNYNGVAEVNTERQANGQI(Q)(V)(D).

Internal sequences:

(M)(A)HLDVWDSWPVQDP(V),

NAGSIFGT(K),

V(E)(E)VYSPKVSTLMASDEVE
FRUCTOSYLTRANSFERASES

CROSS-REFERENCE TO RELATED
APPLICATION

This application is a continuation-in-part of U.S.
application Ser. No. 09/604,958 filed on Jun. 28, 2000,
now U.S. Pat. No. 6,635,460, which claims priority from
European Application No. 00201872.9 filed on May 25,
2000.

The present invention is in the field of enzymatic produc- 
tion of biomolecules. The invention is particularly con- 
cerned with two novel fructosyltransferases derived from 
lactobacilli and with a process for recombinant production 
of the enzymes and for the production of useful levans, 
inulins and fructo-oligosaccharides from sucrose.

BACKGROUND OF THE INVENTION 

Lactic acid bacteria (LAB) play an important role in the
fermentative production of food and feed. Traditionally,
these bacteria have been used for the production of, for
instance, wine, beer, bread, cheese and yoghurt, and for the
preservation of food and feed, e.g. olives, pickles, sausages,
sauerkraut and silage. Because of these traditional
applications, lactic acid bacteria are food-grade
micro-organisms that possess the Generally Recognised As Safe
(GRAS) status. Due to the different products which are
formed during fermentation with lactic acid bacteria, these
bacteria contribute positively to the taste, smell and
preservation of the final product. The group of lactic acid
bacteria includes several genera such as Lactobacillus,
Leuconostoc, Pediococcus, Streptococcus, etc.

In recent years the health promoting properties of
lactic acid bacteria have also received much attention. They
produce an abundant variety of exopolysaccharides (EPS's).
These polysaccharides are thought to contribute to human
health by acting as prebiotic substrates, nutraceuticals,
cholesterol lowering agents or immunomodulants.

To date high molecular weight polysaccharides produced 
by plants (such as cellulose, starch and pectin), seaweeds 
(such as alginate and carrageenan) and bacteria (such as 
alginate, gellan and xanthan) are used in several industrial 
applications as viscosifying, stabilising, emulsifying, gelling 
or water binding agents. Although all these polysaccharides 
are used as food additives, they originate from organisms not 
having the GRAS status. Thus they are less desirable than 
the exopolysaccharides of microorganisms, such as lactic 
acid bacteria, which have the GRAS status. 

The exopolysaccharides produced by LAB can be divided
in two groups, heteropolysaccharides and homopolysaccharides;
these are synthesized by totally different mechanisms.
The former consist of repeating units in which residues of
different types of sugars are present and the latter consist of
one type of monosaccharide. The synthesis of
heteropolysaccharides by lactic acid bacteria, including
lactobacilli, has been studied extensively in recent years.
Considerably less information is available on the synthesis
of homopolysaccharides from lactobacilli, although some
studies have been performed. Homopolysaccharides with
fructose as the constituent sugar can be divided into two
groups, inulins and levans. Inulins consist of 2,1-linked
β-fructofuranoside residues, whereas levans consist of
2,6-linked β-fructofuranoside residues. Both can be linear or
branched. The size of bacterial levans can vary from 20 kDa
up to several MDa. There is limited information on the
synthesis of levans. In most detail this synthesis has been
studied in Zymomonas mobilis and in Bacillus species.
Within lactic acid bacteria, fructosyltransferases have only
been studied in streptococci. So far no fructosyltransferases
have been reported in lactobacilli.

In a recent report the Lactobacillus reuteri strain LB 121
was found to produce both a glucan and a fructan when
grown on sucrose, but only a fructan when grown on
raffinose (van Geel-Schutten, G. H. et al., Appl. Microbiol.
Biotechnol. (1998) 50, 697-703). In another report the
glucan and fructan were characterised by their molecular
weights (of 3,500 and 150 kDa respectively) and the glucan
was reported to be highly branched with a unique structure
consisting of a terminal, 4-substituted, 6-substituted, and
4,6-disubstituted α-glucose in a molar ratio 1.1:2.7:1.5:1.0
(van Geel-Schutten, G. H. et al., Appl. Environ. Microbiol.
(1999) 65, 3008-3014). The fructan was identified as a
linear (2→6)-β-D-fructofuranan (also called a levan). This
was the first example of fructan synthesis by a Lactobacillus
species.

SUMMARY OF THE INVENTION

Two novel genes encoding enzymes having
fructosyltransferase activity have now been found in Lactobacillus
reuteri, and their amino acid sequences have been
determined. These are the first two enzymes identified in a
Lactobacillus species capable of producing a fructan. One of
the enzymes is an inulosucrase which produces a high
molecular weight (>10^7 Da) fructan containing β(2-1) linked
fructosyl units and fructo-oligosaccharides, while the other
is a levansucrase which produces a fructan containing β(2-6)
linked fructosyl units. The invention thus pertains to the
enzymes, to DNA encoding them, to recombinant cells
containing such DNA and to their use in producing
carbohydrates, as defined in the appended claims.

DESCRIPTION OF THE INVENTION 

It was found according to the invention that one of the
novel fructosyltransferases (FTFA; an inulosucrase)
produces a high molecular weight inulin with β(2-1) linked
fructosyl units and fructo-oligosaccharides. The
fructo-oligosaccharides synthesis was also observed in certain
Lactobacillus strains, in particular in certain strains of
Lactobacillus reuteri. However, the inulin has not been found in
Lactobacillus reuteri culture supernatants, but only in
extracts of E. coli cells expressing the above-mentioned
fructosyltransferase. This inulosucrase consists of either 798
amino acids (2394 nucleotides) or 789 amino acids (2367
nucleotides) depending on the potential start codon used.
The molecular weight (MW) deduced from the amino acid
sequence of the latter form is 86 kDa and its isoelectric point
is 4.51, at pH 7.

The amino acid sequence of the inulosucrase is shown in
SEQ ID No. 1 (FIG. 1, amino acid residues 1-789). As
mentioned above, the nucleotide sequence contains two
putative start codons leading to either a 2394 (see SEQ ID
No. 3) or 2367 (see SEQ ID No. 2) nucleotide form of the
inulosucrase. Both putative start codons are preceded by a
putative ribosome binding site, GGGG (located 12 base
pairs upstream of its start codon) or AGGA (located 14 base
pairs upstream of its start codon), respectively (see FIG. 1 and
SEQ ID No. 4).
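
The figures given for the two forms of the inulosucrase can be cross-checked with a minimal sketch. The helper names are hypothetical, and the average residue mass of about 110 Da is a common rule of thumb assumed here, not a value from the description.

```python
CODON_LENGTH = 3
# Assumption: rule-of-thumb average mass of one amino acid residue, in Da.
AVERAGE_RESIDUE_MASS_DA = 110


def coding_nucleotides(n_residues):
    """Nucleotides needed to encode n residues (stop codon not counted)."""
    return n_residues * CODON_LENGTH


def approx_mass_kda(n_residues):
    """Rough protein mass estimate from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


print(coding_nucleotides(798), coding_nucleotides(789))
print(approx_mass_kda(789))
```

The two forms differ by 9 residues, that is, 27 nucleotides between the two putative start codons, and the rough mass estimate for the 789-residue form falls close to the 86 kDa deduced from the sequence.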

The present invention covers a protein having
inulosucrase activity with an amino acid identity of at least 65%,
preferably at least 75%, and more preferably at least 85%,
compared to the amino acid sequence of SEQ ID No. 1. The
invention also covers a part of a protein with at least 15
contiguous amino acids which are identical to the
corresponding part of the amino acid sequence of SEQ ID No. 1.
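
The two criteria above, a minimum percentage identity and a stretch of at least 15 identical contiguous amino acids, can be sketched as follows. This is an illustration under stated assumptions: the sequences are taken to be already aligned with "-" as the gap character, identity is counted over gap-free columns only (other conventions exist), and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def shares_contiguous_stretch(fragment, reference, length=15):
    """True if any window of the given length in fragment occurs in reference."""
    windows = range(len(fragment) - length + 1)
    return any(fragment[i:i + length] in reference for i in windows)


# Toy sequences, not taken from the sequence listing.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
reference = "MKTAYIAKQRQISFVKSHFSRQ"
print(shares_contiguous_stretch(reference[3:18], reference))
```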
Fructosyltransferases have been found in several bacteria
such as Zymomonas mobilis, Erwinia amylovora, Acetobacter
amylovora, Bacillus polymyxa, Bacillus amyloliquefaciens,
Bacillus stearothermophilus, and Bacillus subtilis. In lactic
acid bacteria this type of enzyme previously has only been
found in some streptococci. Most bacterial fructosyltransferases
have a molecular mass of 50-100 kDa (with the exception of
the fructosyltransferase found in Streptococcus salivarius
which has a molecular mass of 140 kDa). Amino acid sequence
alignment revealed that the novel inulosucrase of lactobacilli
has high homology with fructosyltransferases originating from
Gram positive bacteria, in particular with Streptococcus
enzymes. The highest homology (FIG. 2) was found with the
SacB enzyme of Streptococcus mutans Ingbritt A (62% identity
within 539 amino acids).

Certain putative functions based on the alignment and
site-directed mutagenesis studies can be ascribed to several
amino acids of the novel inulosucrase. Asp-263, Glu-330,
Asp-415, Glu-431, Asp-511, Glu-514, Arg-532 and/or Asp-551
of the amino acid sequence of SEQ ID No. 1 are identified as
putative catalytic residues. Noteworthy, a hydrophobicity plot
according to Kyte and Doolittle (1982) J. Mol. Biol. 157,
105-132 suggests that the novel inulosucrase contains a
putative signal sequence according to the Von Heijne rule. The
putative signal peptidase site is located between Gly at
position 21 and Ala at position 22. Furthermore, it is striking
that the C-terminal amino acid sequence of the novel
inulosucrase contains a putative cell wall anchor amino acid
signal LPXTG (SEQ ID No. 5) and a 20-fold repeat of the motif
PXX (residues 690-749 of SEQ ID NO: 1) (see FIG. 1), where P
is proline and X is any other amino acid. In 15 out of 20
repeats, however, the motif is PXT. This motif has so far not
been reported in proteins of prokaryotic and eukaryotic origin.

A nucleotide sequence encoding any of the above mentioned
proteins, mutants, variants or parts thereof is also a subject
of the invention. Furthermore, the nucleic acid sequences
corresponding to expression-regulating regions (promoters,
enhancers, terminators) of at least 30 contiguous nucleic acids
contained in the nucleic acid sequence (-67)-(-1) or 2367-2525
of SEQ ID No. 4 (see also FIG. 1) can be used for homologous
or heterologous expression of genes. Such expression-regulating
sequences are operationally linked to a polypeptide-encoding
nucleic acid sequence such as the genes of the
fructosyltransferase according to the invention. A nucleic acid
construct comprising the nucleotide sequence operationally
linked to an expression-regulating nucleic acid sequence is
also covered by the invention.

A recombinant host cell, such as a mammalian (with the
exception of human), plant, animal, fungal or bacterial cell,
containing one or more copies of the nucleic acid construct
mentioned above is an additional subject of the invention.
The inulosucrase gene (starting at nucleotide 41) has been
cloned in an E. coli expression vector under the control of
an ara promoter in E. coli Top10. E. coli Top10 cells
expressing the recombinant inulosucrase hydrolysed sucrose
and synthesized fructan material. SDS-PAGE of arabinose
induced E. coli Top10 cell extracts suggested that the
recombinant inulosucrase has a molecular weight of 80-100
kDa, which is in the range of other known fructosyltransferases
and in line with the molecular weight of 86 kDa
deduced from the amino acid sequence depicted in FIG. 1.

The invention further covers an inulosucrase according to
the invention which, in the presence of sucrose, produces an
inulin having β(2-1)-linked D-fructosyl units and
fructo-oligosaccharides. Two different types of fructans, inulins
and levans, exist in nature. Surprisingly, the novel
inulosucrase expressed in E. coli Top10 cell synthesizes a high
molecular weight (>10^7 Da) inulin and fructo-oligosaccharides,
while in Lactobacillus reuteri culture supernatants, in
addition to the fructo-oligosaccharides, a levan and not an
inulin is found. This discrepancy can have several
explanations: the inulosucrase gene may be silent in
Lactobacillus reuteri, or may not be expressed in Lactobacillus
reuteri under the conditions tested, or the inulosucrase
may only synthesize fructo-oligosaccharides in its natural
host, or the inulin polymer may be degraded shortly after
synthesis, or may not be secreted and remains cell-associated,
or the inulosucrase may have different activities
in Lactobacillus reuteri and E. coli Top10 cells.

It was furthermore found according to the invention that
certain lactobacilli, in particular Lactobacillus reuteri,
possess another fructosyltransferase, a levansucrase (FTFB), in
addition to the inulosucrase described above. The
N-terminal amino acid sequence of the fructosyltransferase
purified from Lactobacillus reuteri supernatant was found to
be QVESNNYNGVAEVNTERQANGQI (residues 2-24 of
SEQ ID No. 6). Furthermore, three internal sequences were
identified, namely (M)(A)HLDVWDSWPVQDP(V) (SEQ
ID No. 7), NAGSIFGT(K) (SEQ ID No. 8),
V(E)(E)VYSPKVSTLMASDEVE (SEQ ID No. 9). The N-terminal
amino acid sequence could not be identified in the deduced
inulosucrase sequence. Also the amino acid sequences of the
three internal peptide fragments of the purified
fructosyltransferase were not present in the putative inulosucrase
sequence. Evidently, the inulosucrase gene does not encode
the purified fructosyltransferase synthesizing the levan. The
complete amino acid sequence of the levansucrase is shown
in SEQ ID No. 11 and the nucleotide sequence is shown in
SEQ ID No. 10. The levansucrase comprises a putative
membrane anchor (see amino acids 761-765 in SEQ ID No.
11) and a putative membrane spanning domain (see amino
acids 766-787 in SEQ ID No. 11). The fructan produced by
the levansucrase was identified in the Lactobacillus reuteri
culture supernatant as a linear (2→6)-β-D-fructofuranan
with a molecular weight of 150 kDa. The purified enzyme
also produces this fructan.

Additionally, the invention thus covers a protein having
levansucrase activity with an amino acid identity of at least
65%, preferably at least 75%, and more preferably at least
85%, compared to the amino acid sequence of SEQ ID NO.
11. The second novel fructosyltransferase produces a high
molecular weight fructan with β(2-6) linked fructosyl units
with sucrose or raffinose as substrate. The invention also
covers a part of a protein with at least 15 contiguous amino
acids, which are identical to the corresponding part of the
amino acid sequence of SEQ ID No. 11. A nucleotide
sequence encoding any of the above-mentioned proteins,
mutants, variants or parts thereof is a subject of the invention
as well as a nucleic acid construct comprising the nucleotide
sequence mentioned above operationally linked to an
expression-regulating nucleic acid sequence. A recombinant
host cell, such as a mammalian (with the exception of
human), plant, animal, fungal or bacterial cell, containing
one or more copies of the nucleic acid construct mentioned
above is an additional subject of the invention. The
invention further covers a protein according to the invention
which, in the presence of sucrose, produces a fructan having
β(2-6)-linked D-fructosyl units.

The invention also pertains to a process of producing an
inulin-type and/or a levan-type of fructan as described above
using fructosyltransferases according to the invention and a
suitable fructose source such as sucrose, stachyose or
raffinose. The fructans may either be produced by Lactobacillus
strains or recombinant host cells according to the invention
containing one or both fructosyltransferases or by a
fructosyltransferase enzyme isolated by conventional means from
the culture of fructosyltransferase-positive lactobacilli,
especially a Lactobacillus reuteri, or from a recombinant
organism containing the fructosyltransferase gene or genes.
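
The sequence features described above for the inulosucrase, the LPXTG cell wall anchor signal and the run of PXX repeats, lend themselves to a simple pattern search. The sketch below is illustrative only: the function names are hypothetical, the regular expressions are one possible reading of the motifs (X taken as any residue), and the test sequence is a toy string, not SEQ ID No. 1.

```python
import re

LPXTG = re.compile(r"LP.TG")         # cell wall anchor signal, X = any residue
PXX_RUN = re.compile(r"(?:P..){2,}")  # two or more consecutive PXX units


def find_anchor(sequence):
    """Return 1-based start positions of LPXTG matches."""
    return [match.start() + 1 for match in LPXTG.finditer(sequence)]


def longest_pxx_run(sequence):
    """Return (repeat count, PXT count) for the longest PXX run found."""
    runs = [match.group(0) for match in PXX_RUN.finditer(sequence)]
    if not runs:
        return 0, 0
    best = max(runs, key=len)
    repeats = len(best) // 3
    pxt = sum(1 for i in range(0, len(best), 3) if best[i + 2] == "T")
    return repeats, pxt


toy = "MKA" + "PET" * 3 + "PKQ" + "LPQTG" + "NN"
print(find_anchor(toy))
print(longest_pxx_run(toy))
```

A 20-fold repeat of a three-residue motif spans 60 residues, which matches the stated range of residues 690-749.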

Additionally, the invention concerns a process of
producing fructo-oligosaccharides containing the characteristic
structure of the fructans described above using a
Lactobacillus strain or a recombinant host cell according to the
invention containing one or both fructosyltransferases or an
isolated fructosyltransferase according to the invention.
There is a growing interest in oligosaccharides derived from
homopolysaccharides, for instance for prebiotic purposes.
Several fructo- and gluco-oligosaccharides are known to
stimulate the growth of bifidobacteria in the human colon.
Fructo-oligosaccharides produced by the fructosyltransferase
described above are also part of the invention.
Another way of producing fructo-oligosaccharides is by
hydrolysis of the fructans described above. This hydrolysis
can be performed by known hydrolysis methods such as
enzymatic hydrolysis with enzymes such as levanase or
inulinase or by acid hydrolysis. The fructo-oligosaccharides
can also be produced in the presence of a fructosyltransferase
according to the invention and an acceptor molecule
such as lactose or maltose. The fructo-oligosaccharides to be
produced according to the invention preferably contain at
least 2, more preferably at least 3, up to about 20
anhydrofructose units, optionally in addition to one or more other
(glucose, galactose, etc.) units. These
fructo-oligosaccharides are useful as prebiotics, and can be
administered to a mammal in need of improving the bacterial status
of the colon.

The invention also concerns chemically modified fructans
and fructo-oligosaccharides based on the fructans described
above. Chemical modification can be achieved by oxidation,
such as hypochlorite oxidation resulting in ring-opened
2,3-dicarboxy-anhydrofructose units (see e.g. EP-A-427349),
periodate oxidation resulting in ring-opened
3,4-dialdehyde-anhydrofructose units (see e.g. WO 95/12619),
which can be further oxidised to (partly) carboxylated units
(see e.g. WO 00/26257), TEMPO-mediated oxidation resulting
in 1- or 6-carboxy-anhydrofructose units (see e.g. WO
95/07303). The oxidised fructans have improved
water-solubility, altered viscosity and a retarded fermentability and
can be used as metal-complexing agents, detergent
additives, strengthening additives, bioactive carbohydrates,
emulsifiers and water binding agents. They can also be used
as starting materials for further derivatisation such as
cross-linking and the introduction of hydrophobes. Oxidised
fructans coupled to amino compounds such as proteins, or fatty
acids can be used as emulsifiers and stabilizers. (Partial)
hydrolysis of fructans according to the invention and
modified fructans according to the invention results in
fructo-oligosaccharides, which can be used as bioactive
carbohydrates or prebiotics. The oxidised fructans of the invention
preferably contain 0.05-1.0 carboxyl groups per
anhydrofructose unit, e.g. as 6- or 1-carboxyl units.

Another type of chemical modification is
phosphorylation, as described in O.B. Wurzburg (1986)
Modified Starches: properties and uses. CRC Press Inc.,
Boca Raton, 97-112. One way to achieve this modification
is by dry heating fructans with a mixture of monosodium and
disodium hydrogen phosphate or with tripolyphosphate. The
phosphorylated fructans are suitable as wet-end additives in
papermaking, as binders in paper coating compositions, as
warp sizing-agents, and as core binders for sand molds for
metal casting. A further type of derivatisation of the fructans
is acylation, especially acetylation using acetic or propionic
anhydride, resulting in products suitable as bleaching
assistants and for the use in foils. Acylation with e.g. alkenyl
succinic anhydrides or (activated) fatty acids results in
surface-active products suitable as e.g. surfactants,
emulsifiers, and stabilizers.

Hydroxyalkylation, carboxymethylation, and
aminoalkylation are other methods of chemical derivatisation of the
fructans. Hydroxyalkylation is commonly performed by
base-catalysed reaction with alkylene oxides, such as
ethylene oxide, propylene oxide or epichlorohydrine; the
hydroxyalkylated products have improved solubility and
viscosity characteristics. Carboxymethylation is achieved by
reaction of the fructans with mono-chloroacetic acid or its
alkali metal salts and results in anionic polymers suitable for
various purposes including crystallisation inhibitors, and
metal complexants. Aminoalkylation can be achieved by
reaction of the fructans with alkylene imines, haloalkyl
amines or amino-alkylene oxides, or by reaction of
epichlorohydrine adducts of the fructans with suitable amines.
These products can be used as cationic polymers in a variety
of applications, especially as a wet-end additive in paper
making to increase strength, for filler and fines retention, and
to improve the drainage rate of paper pulp. Other potential
applications include textile sizing and wastewater
purification. The above mentioned modifications can be used either
separately or in combination depending on the desired
product. Furthermore, the degree of chemical modification is
variable and depends on the intended use. If necessary 100%
modification, i.e. modification of all anhydrofructose units,
can be performed. However, partial modification, e.g. from
1 modified anhydrofructose unit per 100 up to higher levels,
will often be sufficient in order to obtain the desired effect.
The modified fructans have a DP (degree of polymerisation)
of at least 100, preferably at least 1000 units.
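
To relate the degree of polymerisation to the molecular weights mentioned earlier, a rough estimate can be made as sketched below. The mass of one anhydrofructose unit (C6H10O5, about 162.14 Da) is a standard value assumed here; end groups are ignored, and the function names are hypothetical.

```python
# Assumption: mass of one anhydrofructose unit, C6H10O5, in Da.
ANHYDROFRUCTOSE_MASS_DA = 162.14


def approx_dp(mass_da):
    """Estimate the degree of polymerisation from a fructan mass in Da."""
    return round(mass_da / ANHYDROFRUCTOSE_MASS_DA)


def modified_units(dp, modified_per_hundred):
    """Number of modified units for a given level of modification."""
    return dp * modified_per_hundred / 100


print(approx_dp(150_000))        # the 150 kDa levan
print(modified_units(1000, 1))   # 1 modified unit per 100 at DP 1000
```

On this estimate a 150 kDa levan corresponds to roughly 925 anhydrofructose units, and a DP of 100 corresponds to about 16 kDa.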

Use of a Lactobacillus strain capable of producing a
levan, inulin or fructo-oligosaccharides or a mixture thereof,
as a probiotic, is also covered by the invention. Preferably,
the Lactobacillus strain is also capable of producing a
glucan, especially a 1,4/1,6-α-glucan as referred to above.
The efficacy of some Lactobacillus reuteri strains as
probiotics has been demonstrated in various animals such as
for instance poultry and humans. The administration of some
Lactobacillus reuteri strains to pigs resulted in significantly
lower serum total and LDL-cholesterol levels, while in
children Lactobacillus reuteri is used as a therapeutic agent
against acute diarrhea. For this and other reasons
Lactobacillus reuteri strains, which were not reported to produce the
glucans or fructans described herein, have been
supplemented to commercially available probiotic products. The
mode of action of Lactobacillus reuteri as a probiotic is still
unclear. Preliminary studies indicated that gut colonization
by Lactobacillus reuteri may be of importance. According to
the invention, it was found that the mode of action of
Lactobacillus reuteri as a probiotic may reside partly in the
ability to produce polysaccharides. Lactobacillus strains,
preferably Lactobacillus reuteri strains, and more preferably
Lactobacillus reuteri strain LB 121 and other strains
containing one or more fructosyltransferase genes encoding
proteins capable of producing inulins, levans and/or
fructo-oligosaccharides can thus advantageously be used as a
probiotic. They can also, together with these
polysaccharides, be used as a symbiotic (instead of the term
symbiotic, the term synbiotic can also be used). In that
respect another part of the invention concerns a probiotic or
symbiotic composition containing a Lactobacillus strain
capable of producing an inulin, a levan or
fructo-oligosaccharides and/or a glucan or a mixture thereof, said
production being performed according to the process
according to the invention. The probiotic or symbiotic
compositions of the invention may be directly ingested with
or without a suitable vehicle or used as an additive in
conjunction with foods. They can be incorporated into a
variety of foods and beverages including, but not limited to,
yoghurts, ice creams, cheeses, baked products such as bread,
biscuits and cakes, dairy and dairy substitute foods,
confectionery products, edible oil compositions, spreads, breakfast
cereals, juices and the like.

Furthermore, the invention pertains to a process of
improving the microbial status in the mammalian colon
comprising administering an effective amount of a
Lactobacillus strain capable of producing an oligosaccharide or
polysaccharide according to the invention and to a process
of improving the microbial status of the mammalian colon
comprising administering an effective amount of an
oligosaccharide or polysaccharide produced according to the
process according to the invention.

EXAMPLES 

EXAMPLE 1 

Isolation of DNA from Lactobacillus reuteri, Nucleotide
Sequence Analysis of the Inulosucrase (ftfA) Gene,
Construction of Plasmids for Expression of the Inulosucrase
Gene in E. coli Top10, Expression of the Inulosucrase Gene
in E. coli Top10 and Identification of the Polysaccharides
Produced by the Recombinant Enzyme.

General procedures for cloning, DNA manipulations and
agarose gel electrophoresis were essentially as described by
Sambrook et al. (1989) Molecular cloning: a laboratory
manual, 2nd ed. Cold Spring Harbour Laboratory Press,
Cold Spring Harbour, N.Y. Restriction endonuclease
digestions and ligations with T4 DNA ligase were performed as
recommended by the suppliers. DNA was amplified by PCR
techniques using AmpliTaq DNA polymerase (Perkin
Elmer) or Pwo DNA polymerase. DNA fragments were
isolated from agarose gels using the Qiagen extraction kit
(Qiagen GmbH), following the instructions of the suppliers.
Lactobacillus reuteri strain 121 (LMG 18388) was grown at
37° C. in MRS medium (DIFCO) or in MRS-s medium
(MRS medium containing 100 g/l sucrose instead of 20 g/l
glucose). When fructo-oligosaccharide production was
investigated, phosphate was omitted and ammonium citrate
was replaced by ammonium nitrate in the MRS-s medium.
E. coli strains were grown aerobically at 37° C. in LB
medium, where appropriate supplemented with 50 μg/ml
ampicillin (for selection of recombinant plasmids) or with
0.02% (w/v) arabinose (for induction of the inulosucrase
gene).

Total DNA of Lactobacillus reuteri was isolated accord- 
ing to Verhasselt et al. (1989) FEMS Microbiol. Lett. 59, 
135-140 as modified by Nagy et al. (1995) J. Bacteriol. 177, 
676-687. 

The inulosucrase gene was identified by PCR amplification of
chromosomal DNA of Lactobacillus reuteri using
degenerated primers (5ftf, 6ftfi, and 12ftfi, see table 1)
based on conserved amino acid sequences deduced from
different bacterial fructosyltransferase genes (SacB of
Bacillus amyloliquefaciens, SacB of Bacillus subtilis,
Streptococcus mutans fructosyltransferase and Streptococcus
salivarius fructosyltransferase, see FIG. 4).
Using primers 5ftf and 6ftfi, an
amplification product with the predicted size of about 234 bp
was obtained (FIG. 5A). This 234 bp fragment was cloned
in E. coli JM109 using the pCR2.1 vector and sequenced.
Transformations were performed by electroporation using
the BioRad Gene Pulser apparatus at 2.5 kV, 25 μF and 200
Ω, following the instructions of the manufacturer.
Sequencing was performed according to the method of Sanger et al.
(1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467. Analysis
of the obtained sequence data confirmed that part of a
fructosyltransferase (ftf) gene had been isolated. The 234 bp
amplified fragment was used to design primers 7ftf and 8ftfi
(see table 1). PCR with the primers 7ftf and 12ftfi gave a
product of the predicted size of 948 bp (see FIG. 5B); its
sequence showed clear similarity with previously
characterized fructosyltransferase genes. The 948 bp amplified
fragment was used to design the primers ftfAC1(i) and ftfAC2(i)
(see table 1) for inverse PCR. Using inverse PCR techniques
a 1438 bp fragment of the inulosucrase gene was generated,
including the 3' end of the inulosucrase gene (see FIG. 5C).
The remaining 5' fragment of the inulosucrase gene was
isolated with a combination of standard and inverse PCR
techniques. Briefly, Lactobacillus reuteri DNA was cut with
restriction enzyme XhoI and ligated. PCR with the primers
7ftf and 8ftfi, using the ligation product as a template,
yielded a 290 bp PCR product which was cloned into
pCR2.1 and sequenced. This revealed that primer 8ftfi had
annealed aspecifically as well as specifically, yielding the
290 bp product (see FIG. 5D).

At this time, the N-terminal amino acid sequence of a
fructosyltransferase enzyme (FTFB) purified from the
Lactobacillus reuteri strain 121 was obtained. This sequence
consisted of the following 23 amino acids:
QVESNNYNGVAEVNTERQANGQI (residues 2-24 of SEQ ID No. 6).
The degenerated primer 19ftf (YNGVAEV) (residues 8-14
of SEQ ID NO: 6) was designed on the basis of a part of this
N-terminal peptide sequence and primer 20ftfi was designed
on the 290 bp PCR product. PCR with primers 19ftf and
20ftfi gave a 754 bp PCR product (see FIG. 5E), which was
cloned into pCR2.1 and sequenced. Both DNA strands of the
entire fructosyltransferase gene were double sequenced. In
this way the sequence of a 2.6 kb region of the Lactobacillus
reuteri DNA, containing the inulosucrase gene and its
surroundings, was obtained.
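
The design of a degenerate primer from a peptide can be illustrated with a short sketch. It back-translates the peptide YNGVAEV into IUB-coded codons, assuming the standard genetic code; the dictionary and function names are hypothetical and only the residues needed here are covered. The result agrees with the first seven codons of primer 19ftf as listed in table 1.

```python
# Degenerate codons (IUB codes) for the standard genetic code.
# Only the residues that occur in the example peptide are listed.
DEGENERATE_CODON = {
    "Y": "TAY",  # Tyr: TAT, TAC
    "N": "AAY",  # Asn: AAT, AAC
    "G": "GGN",  # Gly: GGA, GGC, GGG, GGT
    "V": "GTN",  # Val: GTA, GTC, GTG, GTT
    "A": "GCN",  # Ala: GCA, GCC, GCG, GCT
    "E": "GAR",  # Glu: GAA, GAG
}


def back_translate(peptide: str) -> str:
    """Return the degenerate coding sequence of a peptide, codons joined by '-'."""
    return "-".join(DEGENERATE_CODON[aa] for aa in peptide)


primer_core = back_translate("YNGVAEV")
print(primer_core)
```

The two extra bases (AA) at the 3' end of primer 19ftf in table 1 are consistent with the invariant first two positions of the codon for the next residue of the peptide, asparagine (AAY).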
The plasmids for expression of the inulosucrase gene in E.
coli Top10 were constructed as described hereafter. A 2414
bp fragment, containing the inulosucrase gene starting at the
first putative start codon at position 41, was generated by
PCR, using primers ftfA1 and ftfA2i. Both primers
contained suitable restriction enzyme recognition sites (an NcoI
site at the 5' end of ftfA1 and a BglII site at the 3' end of
ftfA2i). PCR with Lactobacillus reuteri DNA, Pwo DNA
polymerase and primers ftfA1 and ftfA2i yielded the
complete inulosucrase gene flanked by NcoI and BglII restriction
sites. The PCR product with blunt ends was ligated directly
into pCRbluntII-Topo. Using the NcoI and BglII restriction
sites, the putative ftfA gene was cloned into the expression
vector pBAD, downstream of the inducible arabinose
promoter and in frame upstream of the Myc epitope and the His
tag. The pBAD vector containing the inulosucrase gene
(pSVH101) was transformed to E. coli Top10 and used to
study inulosucrase expression. Correct construction of the
plasmid containing the complete inulosucrase gene was
confirmed by restriction enzyme digestion analysis and by
sequence analysis, showing an in frame cloning of the
inulosucrase gene using the ribosomal binding site provided
by the pBAD vector and the first putative start codon (at
position 41) of inulosucrase (see FIG. 1).
Plasmid DNA of E. coli was isolated using the alkaline
lysis method of Birnboim and Doly (1979) Nucleic Acids
Res. 7, 1513-1523 or with a Qiagen plasmid kit following
the instructions of the supplier. Cells of E. coli Top10 with
pSVH101 were grown overnight in LB medium containing
0.02% (w/v) arabinose and were harvested by
centrifugation. The pellet was washed with 25 mM sodium acetate
buffer pH 5.4 and the suspension was centrifuged again.
Pelleted cells were resuspended in 25 mM sodium acetate
buffer pH 5.4. Cells were broken by sonication. Cell debris
and intact cells were removed by centrifugation for 30 min
at 4° C. at 10,000xg and the resulting cell free extract was
used in the enzyme assays.

The fructosyltransferase activities were determined at 37°
C. in reaction buffer (25 mM sodium acetate, pH 5.4, 1 mM
CaCl2, 100 g/l sucrose) by monitoring the release of glucose
from sucrose, by detecting fructo-oligosaccharides or by
determining the amount of fructan polymer produced using
E. coli cell free extracts or Lactobacillus reuteri culture
supernatant as enzyme source. Sucrose, glucose and fructose
were determined enzymatically using commercially
available kits.

Fructan production by Lactobacillus reuteri was studied
with cells grown in MRS-s medium. Product formation was
also studied with cell-free extracts of E. coli containing the
novel inulosucrase incubated in reaction buffer (1 mg
protein/10 ml buffer, incubated overnight at 37° C.).
Fructans were collected by precipitation with ethanol. 1H-NMR
spectroscopy and methylation analysis were performed as
described by van Geel-Schutten et al. (1999) Appl. Environ.
Microbiol. 65, 3008-3014. The molecular weights of the
fructans were determined by high performance size
exclusion chromatography coupled on-line with a multi angle
laser light scattering and a differential refractive index
detector. Fructo-oligosaccharide synthesis was studied in
Lactobacillus reuteri culture supernatants and in extracts of
E. coli cells containing the novel inulosucrase incubated in
reaction buffer (1 mg protein/10 ml buffer, incubated
overnight at 37° C.). Glucose and fructose were determined
enzymatically as described above and the
fructo-oligosaccharides produced were analyzed using a Dionex
column. The incubation mixtures were centrifuged for 30
min at 10,000xg and diluted 1:5 in a 100% DMSO solution
prior to injection on a Dionex column. A digest of inulin
(DP1-20) was used as a standard. Separation of compounds
was achieved with anion-exchange chromatography on a
CarboPac PA1 column (Dionex) coupled to a CarboPac PA1
guard column (Dionex). Using a Dionex GP50 pump the
following gradient was generated: % eluent B is 5% (0 min);
35% (10 min); 45% (20 min); 65% (50 min); 100% (54-60
min); 5% (61-65 min). Eluent A was 0.1 M NaOH and
eluent B was 0.6 M NaAc in a 0.1 M NaOH solution.
Compounds were detected using a Dionex ED40
electrochemical detector with an Au working electrode and a
Ag/AgCl reference electrode with a sensitivity of 300 nC.
The pulse program used was: +0.1 Volt (0-0.4 s); +0.7 Volt
(0.41-0.60 s); -0.1 Volt (0.61-1.00 s). Data were integrated
using a Perkin Elmer Turbochrom data integration system. A
different separation of compounds was done on a cation
exchange column in the calcium form (Benson BCX4). As
mobile phase Ca-EDTA in water (100 ppm) was used. The
elution speed was 0.4 ml/min at a column temperature of 85°
C. Detection of compounds was done by a refractive index
detector (Jasco 830-RI) at 40° C. Quantification of compounds was
achieved by using the software program Turbochrom
(Perkin Elmer).
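
The anion-exchange gradient above can be written down as a table of setpoints. The following sketch assumes linear ramps between the listed setpoints (the text only gives the setpoints, so the ramp shape is an assumption) and derives the sodium acetate concentration from the composition of eluent B; all names are hypothetical.

```python
from bisect import bisect_right

# (time in min, % eluent B) setpoints taken from the gradient described above.
GRADIENT = [(0, 5), (10, 35), (20, 45), (50, 65),
            (54, 100), (60, 100), (61, 5), (65, 5)]
ACETATE_IN_B_MOLAR = 0.6  # eluent B: 0.6 M NaAc in 0.1 M NaOH


def percent_b(t_min: float) -> float:
    """% eluent B at t_min, assuming linear ramps between setpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-65 min program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:  # exactly at the last setpoint
        return float(GRADIENT[-1][1])
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def acetate_molar(t_min: float) -> float:
    """Sodium acetate concentration (M) delivered to the column at t_min."""
    return ACETATE_IN_B_MOLAR * percent_b(t_min) / 100.0


for t in (0, 5, 10, 57, 65):
    print(t, percent_b(t), round(acetate_molar(t), 3))
```

Because eluent A and eluent B both contain 0.1 M NaOH, the hydroxide concentration stays constant in this model and only the acetate concentration changes.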

SDS-PAGE was performed according to Laemmli (1970)
Nature 227, 680-685 using 7.5% polyacrylamide gels. After
electrophoresis gels were stained with Coomassie Brilliant
Blue or an activity staining (Periodic Acid Schiff, PAS) was
carried out as described by Van Geel-Schutten et al. (1999)
Appl. Environ. Microbiol. 65, 3008-3014.

TABLE 1

Nucleotide sequence of primers used in PCR reactions to identify the
inulosucrase gene.

Primer    Location
name      (bp)      Nucleotide sequence (and SEQ ID No)

ftfAC1    1176      CTG-ATA-ATA-ATG-GAA-ATG-TAT-CAC (SEQ ID No. 12)
ftfAC2i   1243      CAT-GAT-CAT-AAG-TTT-GGT-AGT-AAT-AG (SEQ ID No. 13)
ftfAC1i   1176      GTG-ATA-CAT-TTC-CAT-TAT-TAT-CAG (SEQ ID No. 14)
ftfAC2    1243      CTA-TTA-CTA-CCA-AAC-TTA-TGA-TCA-TG (SEQ ID No. 15)
ftfA1               CCA-TGG-CCA-TGG-TAG-AAC-GCA-AGG-AAC-ATA-AAA-AAA-TG
                    (SEQ ID No. 16)
ftfA2i              AGA-TCT-AGA-TCT-GTT-AAA-TCG-ACG-TTT-GTT-AAT-TTC-TG
                    (SEQ ID No. 17)
5ftf      845       GAY-GTN-TGG-GAY-WSN-TGG-GCC (SEQ ID No. 18)
6ftfi     1052      GTN-GCN-SWN-CCN-SWC-CAY-TSY-TG (SEQ ID No. 19)
7ftf      1009      GAA-TGT-AGG-TCX-AAT-TTT-TGG-C (SEQ ID No. 20)
8ftfi     864       CCT-GTC-CGA-ACA-TCT-TGA-ACT-G (SEQ ID No. 21)
12ftfi    1934      ARR-AAN-SWN-GGN-GCV-MAN-GTN-SW (SEQ ID No. 22)
19ftf     1         TAY-AAY-GGN-GTN-GCN-GAR-GTN-AA (SEQ ID No. 23)
20ftfi    733       CCG-ACC-ATC-TTG-TTT-GAT-TAA-C (SEQ ID No. 24)

Listed from left to right are: primer name (i, inverse primer), location (in
bp) in ftfA and the sequence from 5' to 3' according to IUB group codes
(N = any base; M = A or C; R = A or G; W = A or T; S = C or G; Y = C
or T; K = G or T; B = not A; D = not C; H = not G; and V = not T).
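
The IUB group codes in the footnote determine how many distinct oligonucleotides a degenerate primer represents. As a rough illustration (the function and dictionary names are hypothetical, not part of the described method), the degeneracy is the product of the number of bases allowed at each position:

```python
from math import prod

# Number of bases represented by each IUB code listed in the table footnote.
IUB_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "M": 2, "R": 2, "W": 2, "S": 2, "Y": 2, "K": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    bases = primer.replace("-", "").upper()
    return prod(IUB_CHOICES[base] for base in bases)


print(degeneracy("GAY-GTN-TGG-GAY-WSN-TGG-GCC"))     # primer 5ftf
print(degeneracy("TAY-AAY-GGN-GTN-GCN-GAR-GTN-AA"))  # primer 19ftf
```

Under this count, primer 5ftf corresponds to a pool of 256 sequences and primer 19ftf to a pool of 2048 sequences.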

EXAMPLE 2

Purification and Amino Acid Sequencing of the
Levansucrase (FTFB).
Protein Purification

Samples were taken between each step of the purification
process to determine the enzyme activity (by glucose
GOD-Perid method) and protein content (by Bradford analysis and
acrylamide gel electrophoresis). Collected chromatography
fractions were screened for glucose liberating activity
(GOD-Perid method) to determine the enzyme activity.

One liter of an overnight culture of LB121 cells grown on
MRS medium containing 50 grams per liter maltose was
centrifuged for 15 min. at 10,000xg. The supernatant was
precipitated with 1.5 liter of a saturated ammonium sulphate
solution. The ammonium sulphate solution was added at a
rate of 50 ml/min. under continuous stirring. The resulting
60% (w/v) ammonium sulphate solution was centrifuged for
15 min. at 10,000xg. The precipitate was resuspended in 10
ml of a sodium phosphate solution (10 mM, pH 6.0) and
dialysed overnight against 10 mM sodium phosphate, pH
6.0.

A hydroxylapatite column was washed with a 10 mM
sodium phosphate solution pH 6.0; the dialysed sample was
loaded on the column. After eluting the column with 200
mM sodium phosphate, pH 6.0, the eluted fractions were
screened for glucose releasing activity and fractions were
pooled for phenyl superose (a hydrophobic interactions
column) chromatography. The pooled fractions were diluted
1:1 (v:v) with 25 mM sodium acetate, 2 M ammonium
sulphate, pH 5.4 and loaded on a phenyl superose column
(washed with 25 mM sodium acetate, 1 M ammonium
sulphate, pH 5.4). In a gradient from 25 mM sodium acetate,
1 M ammonium sulphate, pH 5.4 (A) to 25 mM sodium
acetate, pH 5.4 (B) fractions were collected from 35% B to
50% B.

Pooled fractions from the phenyl superose column were 
loaded on a gel filtration (superdex) column and eluted by a
25 mM acetate, 0.1 M sodium chloride, pH 5.4 buffer. The 
superdex fractions were loaded on a washed (with 25 mM 
sodium acetate, pH 5.4) Mono Q column and eluted with 25 
mM sodium acetate, 1 M sodium chloride, pH 5.4. The 
fractions containing glucose liberating activity were pooled, 
dialysed against 25 mM sodium acetate, pH 5.4, and stored 
at -20° C. 

A levansucrase enzyme was purified from LB121 cultures
grown on media containing maltose using ammonium
sulfate precipitation and several chromatography column steps
(table 2). Maltose (glucose-glucose) was chosen because
neither glucansucrase nor levansucrase can use maltose as
substrate. LB121 will grow on media containing maltose but
will not produce polysaccharide. From earlier experiments it
was clear that even with harsh methods the levansucrase
enzyme could not be separated from its product levan. These
harsh methods included boiling the levan in a SDS solution
and treating the levan with HCl and TFA. No levanase
enzyme was commercially available for the enzymatic
breakdown of levan. Only a single levansucrase was
detected in maltose culture supernatants. In order to prove
that the enzyme purified from maltose culture supernatant is
the same enzyme which is responsible for the levan
production during growth on raffinose, biochemical and
biophysical tests were performed.

TABLE 2

Purification of the Lactobacillus reuteri LB 121

                                  Total      Specific
                        Protein   Activity   Activity   Purification   Yield
Step                    (mg)      (U)        (U/mg)     (fold)         (%)

Supernatant             128       64         0.5        1              100
Ammonium sulfate        35.2      42         1.2        2.4            65.6
precipitation (65%)
Hydroxylapatite         1.5       30.6       20.4       40.8           47.8
Phenyl superose         0.27      23         85         170            36
Gel Filtration          0.055     10         182        360            16
MonoQ                   0.0255    4          176        352            6


Amino Acid Sequencing of FTFB

A 5% SDS-PAA gel was allowed to "age" overnight in 
order to reduce the amount of reacting chemical groups in 
the gel. Reaction of chemicals in the PAA gel (TEMED and 
ammonium persulphate) with proteins can cause some 
undesired effects, such as N-terminal blocking of the 
protein, making it more difficult to determine the protein 
amino acid composition. 0.1 mM thioglycolic acid 
(scavenger to reduce the amount of reactive groups in the 
PAA gel material) was added to the running buffer during 
electrophoresis. 

In order to determine the amino acid sequence of internal
peptides of protein bands running in a SDS-PAA gel, protein
containing bands were cut out of the PAA gel. After
fractionating the protein by digestion with chymotrypsin the
N-terminal amino acid sequences of the digested proteins
were determined (below).

N-terminal sequencing was performed by Western
blotting of the proteins from the PAA gel to an Immobilon PVDF
membrane (Millipore/Waters Inc.) at 0.8 mA/cm2 for 1 h.
After staining the PVDF membrane with Coomassie
Brilliant Blue without adding acetic acid (to reduce N-terminal
blocking) and destaining with 50% methanol, the
corresponding bands were cut out of the PVDF membrane for
N-terminal amino acid sequence determination.

Amino acid sequence determination was performed by
automated Edman degradation as described by Koningsberg
and Steinman (1977) The proteins (third edition) volume 3,
1-178 (Neurath and Hill, eds.). The automated equipment
for Edman degradation was an Applied Biosystems model
477A pulse-liquid sequenator described by Hewick et al.
(1981), J. Biol. Chem. 15, 7990-7997 connected to a
RP-HPLC unit (model 120A, Applied Biosystems) for
amino acid identification.

The N-terminal sequence of the purified FTFB was
determined and found to be: (A) Q V E S N N Y N G V A E V
N T E R Q A N G Q I (G) (V) (D) (SEQ ID No. 6). Three
internal peptide sequences of the purified FTFB were
determined: (M) (A) H L D V W D S W P V Q D P (V) (SEQ ID
No. 7); N A G S I F G T (K) (SEQ ID No. 8); and V (E) (E)
V Y S P K V S T L M A S D E V E (SEQ ID No. 9).

The following primers were designed on the basis of the
N-terminal and internal peptide fragments of FTFB. Listed
from left to right are: primer name, source peptide fragment
and sequence (from 5' to 3'). FTFB1+FTFB3i yields
approximately a 1400 bp product in a PCR reaction. FTFB1
forward (N-terminal): AA T/C-TAT-AA T/C-GG T/C-GTT-
GC G/A-T/C GA-AGT (SEQ ID No. 25); and FTFB3i
reverse (Internal 3): TAC-CGN-A/T C/G N-CTA-CTT-
CAA-CTT (SEQ ID No. 26). The FTFB gene was partly
isolated by PCR with primers FTFB1 and FTFB3i. PCR
with these primers yielded a 1385 bp amplicon, which after
sequencing showed high homology to ftfA and SacB from
Streptococcus mutans.

EXAMPLE 3 

Oxidation of Levans 

For TEMPO-mediated oxidation, a levan according to the
invention prepared as described above (dry weight 1 g, 6.15
mmol) was resuspended in 100 ml water. Next,
2,2,6,6-tetramethylpiperidine-1-oxyl (TEMPO; 1% by weight
compared to the polysaccharide (0.01 g, 0.065 mmol)) was
added and resuspended in 20 min. Sodium bromide (0.75 g,
7.3 mmol) was added and the suspension was cooled down
to 0° C. This reaction also proceeded without bromide. A
solution of hypochlorite (6 ml, 15% solution, 12.6 mmol)
was adjusted to pH 10.0 with 3M HCl and cooled to 0° C.
This solution was added to the suspension of the
polysaccharide and TEMPO. The course of the reaction was
followed by monitoring the consumption of sodium hydroxide
solution, which is equivalent to the formation of uronic acid.
After 30 min, 60 ml 0.1M NaOH was consumed. This
amount corresponds to the formation of 97% uronic acid.
Thereafter, the solution was poured out in 96% ethanol
(comprising 70% of the volume of the solution) causing the
product to precipitate. The white precipitate was
centrifuged, resuspended in ethanol/water (70/30 v/v) and
centrifuged again. Next, the precipitate was resuspended in
96% ethanol and centrifuged. The obtained product was
dried at reduced pressure. The uronic acid content was
determined by means of the uronic acid assay according to
Blumenkrantz and Asboe-Hansen (Anal. Biochem., 54
(1973), 484). A calibration curve was generated using
polygalacturonic acid (5, 10, 15 and 20 μg). With this
calibration curve the uronic acid content in a sample of 20
μg of the product was determined. The obtained result was
a content of 95% uronic acid with a yield of 96%.
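
The relation between sodium hydroxide consumption and degree of oxidation can be checked with a short calculation. This is a sketch under two assumptions: one equivalent of NaOH is consumed per uronic acid group formed (as the text states the consumption is equivalent to uronic acid formation), and the anhydrofructose repeat unit has a molar mass of about 162.14 g/mol. Function names are hypothetical.

```python
ANHYDROFRUCTOSE_G_PER_MOL = 162.14  # C6H10O5 repeat unit (assumed value)


def anhydro_units_mmol(dry_weight_g: float) -> float:
    """mmol of anhydrofructose units in a given dry weight of levan."""
    return 1000.0 * dry_weight_g / ANHYDROFRUCTOSE_G_PER_MOL


def degree_of_oxidation_pct(naoh_ml: float, naoh_molar: float,
                            units_mmol: float) -> float:
    """% of units oxidised, assuming 1 mol NaOH per mol uronic acid formed."""
    naoh_mmol = naoh_ml * naoh_molar
    return 100.0 * naoh_mmol / units_mmol


print(round(anhydro_units_mmol(1.0), 2))                  # mmol units in 1 g levan
print(round(degree_of_oxidation_pct(60, 0.1, 6.15), 1))   # 60 ml of 0.1 M NaOH
```

With the figures given in the example (60 ml of 0.1 M NaOH, 6.15 mmol of units) the calculation gives roughly 97.6%, in line with the reported 97%.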
Partial Oxidation 

For partial oxidation, a levan according to the invention
(dry weight 2 g, 12.3 mmol) was resuspended in 25 ml water.
Next, TEMPO (1% by weight compared to the
polysaccharide (0.02 g, 0.13 mmol)) was added, resuspended in 20 min
and cooled to 0° C. A solution of hypochlorite (1 ml, 15%
solution, 2.1 mmol) was adjusted to pH 9.0 with 3M HCl and
cooled down to 0° C. This solution was added to the
suspension of the polysaccharide and TEMPO. Within 5 min
the mixture became a solid gel.



EXAMPLE 4 

Adhesion of Lactobacillus reuteri Strains to Caco-2 Cell 
Lines 

The adhesion of Lactobacillus reuteri strains to Caco-2
cell lines was determined as described below. Firstly, a
bacterial suspension was prepared as follows. Lactobacillus
reuteri strains LB 121, 35-5, K24 and DSM20016 and L.
rhamnosus LGG (a well known probiotic strain with good
adhering properties) were cultured in MRS broth
supplemented with 5 μl/ml of methyl-1,2-[3H]-thymidine at 37° C.
for 18-20 h before the adhesion assays. The cultures were
harvested by centrifugation, washed with phosphate
buffered saline (PBS) and resuspended in PBS or PBS
supplemented with 30 g/l sucrose (see Table 3) to a final density of
about 2x10^ cfu/ml. Prior to the adhesion assay, the cell
suspensions in PBS with 30 g/l sucrose were incubated for
1 hour at 37° C., whereas the cell suspensions in PBS were
kept on ice for 1 hour. After incubation at 37° C., the
suspensions in PBS with sucrose were centrifuged and the
cells were washed with and resuspended in PBS to a final
density of about 2x10^ cfu/ml.

Caco-2 cells were cultured as follows. Subcultures of
Caco-2 cells (ATCC, code HTB 37, human colon
adenocarcinoma), stored as frozen stock cultures in liquid
nitrogen, were used for the adhesion tests. The Caco-2 cells
were grown in culture medium consisting of Dulbecco's
modified Eagle medium (DMEM), supplemented with
heat-inactivated foetal calf serum (10% v/v), non-essential amino
acids (1% v/v), L-glutamine (2 mM) and gentamicin (50
μg/ml). About 2,000,000 cells were seeded in 75 cm2 tissue
culture flasks containing culture medium and cultured in a
humidified incubator at 37° C. in air containing 5% CO2.
Near confluent Caco-2 cell cultures were harvested by
trypsinisation and resuspended in culture medium. The
number of cells was established using a Bürker-Türk
counting chamber.

TABLE 3

Incubation of the different Lactobacillus strains prior to the
adhesion assays.

Lactobacillus                                        Polysaccharide
strain               Extra incubation                produced             Group

reuteri 121          PBS sucrose, 37° C. for 1 hr    glucan and fructan   As
reuteri 35-5         PBS sucrose, 37° C. for 1 hr    glucan               Bs
reuteri K24          PBS sucrose, 37° C. for 1 hr    none                 Cs
reuteri 121          PBS on ice                      none                 D
reuteri DSM20016*    PBS on ice                      none                 E
rhamnosus GG         PBS on ice                      none                 F

*Type strain of L. reuteri



For the following experiments a Caco-2 monolayer
transport system was used. Caco-2 cells cultured in a
two-compartment transport system are commonly used to study
the intestinal epithelial permeability. In this system the
Caco-2 cell differentiates into polarized columnar cells after
reaching confluency. The Caco-2 system has been shown to
simulate the passive and active transcellular transport of
electrolytes, sugars, amino acids and lipophilic compounds
(Hillgren et al., 1995, Dulfer et al., 1996, Duizer et al., 1997).
Also, a clear correlation between the in vivo absorption and
the permeability across the monolayers of Caco-2 cells has
been reported (Artursson and Karlsson, 1990). For the
present transport studies, Caco-2 cells were seeded on
semi-permeable filter inserts (12 wells Transwell plates,
Costar) at ca. 100,000 cells per filter (growth area ±1 cm2
containing 2.5 ml culture medium). The cells on the insert
were cultured for 17 to 24 days at 37° C. in a humidified
incubator containing 5% CO2 in air. During this culture
period the cells have been subjected to an enterocyte-like
differentiation. Gentamicin was eliminated from the culture
medium two days prior to the adhesion assays.

The adhesion assay was performed as follows. PBS was
used as exposure medium. 25 μl of a bacterial suspension
(2x10^ cfu/ml) were added to 0.5 ml medium. The apical
side of the Caco-2 monolayers was incubated with the
bacterial suspensions for 1 hour at 37° C. After incubation,
remaining fluid was removed and the cells were washed
three times with 1 ml PBS. Subsequently, the Caco-2
monolayers were digested overnight with 1 ml 0.1M NaOH, 1%
SDS. The lysate was mixed with 10 ml Hionic Fluor
scintillation liquid and the radioactivity was measured by
liquid scintillation counting using a LKB/Wallac
scintillation counter. As a control, the radioactivity of the bacterial
suspensions was measured. For each test group, the
percentage of bacteria attached to the monolayers was calculated.
All adhesion tests were performed in quadruplicate. In Table 4
the results of the bacterial adhesion test to Caco-2 cell lines
are given. From the results it can be concluded that the glucans
and the fructans contribute to the adherence of Lactobacillus
reuteri to Caco-2 cell lines. This could indicate that
Lactobacillus reuteri strains producing EPS possess improved
probiotic characteristics or that Lactobacillus reuteri and its
polysaccharides could function as an excellent symbiotic.
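
The percentage of attached bacteria can be derived from the scintillation counts as the radioactivity recovered in the monolayer lysate relative to the radioactivity added with the bacterial suspension. The sketch below illustrates that ratio; the counts are invented purely for illustration, and the function name and the per-microlitre normalisation of the control count are assumptions rather than details given in the text.

```python
from statistics import mean


def percent_adhered(lysate_dpm: float, suspension_dpm_per_ul: float,
                    volume_added_ul: float = 25.0) -> float:
    """% of added radioactivity (bacteria) recovered in the monolayer lysate."""
    added_dpm = suspension_dpm_per_ul * volume_added_ul
    return 100.0 * lysate_dpm / added_dpm


# Hypothetical quadruplicate lysate counts (dpm) and control count (dpm/ul).
lysate_counts = [500.0, 520.0, 480.0, 500.0]
control_dpm_per_ul = 400.0

adhesion = [percent_adhered(c, control_dpm_per_ul) for c in lysate_counts]
print([round(a, 2) for a in adhesion], round(mean(adhesion), 2))
```

Averaging the four replicate percentages gives one value per test group, comparable in form to the per-group figures reported for the assay.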



TABLE 4

The results of the bacterial adhesion test to Caco-2 cell lines.

Group            % of bacteria bound
(see Table 3)    to the monolayer

As               6.5
Bs               5.7
Cs               1.8
D                23
E                0.9
F                13

DESCRIPTION OF THE FIGURES 

FIG. 1: The nucleic acid (SEQ ID NO: 4) and deduced
amino acid sequences (SEQ ID NOS 27 and 1) of the novel
inulosucrase of Lactobacillus reuteri. Also encompassed
within the figure is the comparison peptide (SEQ ID NO:
28). Furthermore, the designations and orientation (< for 3'
to 5' and > for 5' to 3') of the primers and the restriction
enzymes used for (inverse) PCR are shown at the right hand
side. Putative start codons (ATG, at positions 41 and 68) and
stop codon (TAA, at position 2435) are shown in bold. The
positions of the primers used for PCR are shown in bold/
underlined. The NheI restriction sites (at positions 1154 and
2592) used for inverse PCR are underlined. The primers
used and their exact positions in the inulosucrase sequence
are shown in table 1. Starting at amino acid 690, the 20 PXX
(residues 690-749 of SEQ ID NO: 1) repeats are underlined.
At amino acid 755 the LPXTG (SEQ ID NO: 5) motif is
underlined.
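
The start and stop positions given for FIG. 1 can be related to protein length with simple arithmetic. The sketch below assumes that each stated position is the 1-based coordinate of the first nucleotide of the codon; the function name is hypothetical.

```python
def orf_length_aa(start_pos: int, stop_pos: int) -> int:
    """Residues encoded from the start codon up to (not including) the stop codon.

    Both positions are taken as the 1-based coordinate of the first
    nucleotide of the respective codon (an assumption).
    """
    coding_nt = stop_pos - start_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt // 3


print(orf_length_aa(41, 2435))  # first putative start codon
print(orf_length_aa(68, 2435))  # second putative start codon
```

Under this assumption both putative start codons are in frame with the stop codon, giving 798 and 789 residues respectively; the latter matches the length of 789 stated for SEQ ID NO 1 in the sequence listing.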

FIG. 2: Dendrogram of bacterial and plant
fructosyltransferases. The horizontal distances are a measure for the
difference at the amino acid sequence level. 10% difference
is indicated by the upper bar. Bootstrap values (in
percentages) are given at the root of each tree.
Fructosyltransferases of Gram positive bacteria are indicated in the
lower half of the figure (B. stearothermophilus SurB; B.
amyloliquefaciens SacB; B. subtilis SacB; S. mutans SacB;
L. reuteri FtfA (inulosucrase); S. salivarius Ftf). Plant
fructosyltransferases are indicated in the middle part of the
figure (Cynara scolymus Ss-1 ft; Allium cepa F-6 gft;
Hordeum vulgare Sf-6 ft). Fructosyltransferases of Gram
negative bacteria are shown in the upper part of the figure (Z.
mobilis LevU; Z. mobilis SucE2; Z. mobilis SacB; E.
amylovora Lsc; A. diazotrophicus LsdA).

FIG. 3: The N-terminal (SEQ ID NO: 6) and three internal
amino acid sequences (SEQ ID NOS 7-9) of the novel
levansucrase of Lactobacillus reuteri.

FIG. 4: Parts of an alignment of the deduced amino acid 
sequences of some bacterial fructosyltransferase genes 
(SEQ ID NOS: 29-40). Sequences in bold indicate the 
consensus sequences used to construct the degenerate 
primers 5 ftf, 6 ftfi and 12 ftfi. (*) indicates a position with 
a fully conserved amino acid residue. (:) indicates a position 
with a fully conserved 'strong' group: STA, NEQK, NHQK, 
NDEQ, QHRK, MILV, MILF, HY, FYW. (.) indicates a 
position with a fully conserved 'weaker' group: CSA, ATV, 
SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, 
NEQHRK, FVLIM, HFY. Groups are according to the 
Pam250 residue weight matrix described by Altschul et al. 
(1990) J. Mol. Biol. 215, 403-410. 
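
The three conservation symbols can be assigned to an alignment column mechanically from the group lists given in the legend. The sketch below is illustrative only: the function names are hypothetical, and the treatment of gapped columns (no symbol) is an assumption that the legend does not address.

```python
# Residue groups exactly as listed in the FIG. 4 legend.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK",
               "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column (a string of residues)."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # assumption: gapped columns receive no symbol
    if len(residues) == 1:
        return "*"  # fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"  # all residues fall within one 'strong' group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."  # all residues fall within one 'weaker' group
    return " "


def consensus_line(aligned_seqs):
    """Build the symbol line for a list of equal-length aligned sequences."""
    return "".join(column_symbol("".join(col)) for col in zip(*aligned_seqs))


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from FIG. 4.
    print(consensus_line(["MILK", "MLLR", "MVLQ"]))
```

Strong groups are tested before weak groups so that a column satisfying both is marked with the stronger symbol.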

FIG. 5: The strategy used for the isolation of the inulo- 
sucrase gene from Lactobacillus reuteri 121 chromosomal 
DNA. 



SEQUENCE LISTING 



<160> NUMBER OF SEQ ID NOS: 40 

<210> SEQ ID NO 1 
<211> LENGTH: 789 
<212> TYPE: PRT 
<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 1 

Met Tyr Lys Ser Gly Lys Asn Trp Ala Val Val Thr Leu Ser Thr Ala 
1 5 10 15 

Ala Leu Val Phe Gly Ala Thr Thr Val Asn Ala Ser Ala Asp Thr Asn 
20 25 30 

Ile Glu Asn Asn Asp Ser Ser Thr Val Gln Val Thr Thr Gly Asp Asn 
35 40 45 

Asp Ile Ala Val Lys Ser Val Thr Leu Gly Ser Gly Gln Val Ser Ala 
50 55 60 

Ala Ser Asp Thr Thr Ile Arg Thr Ser Ala Asn Ala Asn Ser Ala Ser 
65 70 75 80 

Ser Ala Ala Asn Thr Gln Asn Ser Asn Ser Gln Val Ala Ser Ser Ala 
85 90 95 

Ala Ile Thr Ser Ser Thr Ser Ser Ala Ala Ser Leu Asn Asn Thr Asp 
100 105 110 

Ser Lys Ala Ala Gln Glu Asn Thr Asn Thr Ala Lys Asn Asp Asp Thr 
115 120 125 

Gln Lys Ala Ala Pro Ala Asn Glu Ser Ser Glu Ala Lys Asn Glu Pro 
130 135 140 

Ala Val Asn Val Asn Asp Ser Ser Ala Ala Lys Asn Asp Asp Gln Gln 
145 150 155 160 

Ser Ser Lys Lys Asn Thr Thr Ala Lys Leu Asn Lys Asp Ala Glu Asn 
165 170 175 

Val Val Lys Lys Ala Gly Ile Asp Pro Asn Ser Leu Thr Asp Asp Gln 
180 185 190 

Ile Lys Ala Leu Asn Lys Met Asn Phe Ser Lys Ala Ala Lys Ser Gly 
195 200 205 

Thr Gln Met Thr Tyr Asn Asp Phe Gln Lys Ile Ala Asp Thr Leu Ile 
210 215 220 

Lys Gln Asp Gly Arg Tyr Thr Val Pro Phe Phe Lys Ala Ser Glu Ile 
225 230 235 240 

Lys Asn Met Pro Ala Ala Thr Thr Lys Asp Ala Gln Thr Asn Thr Ile 
245 250 255 

Glu Pro Leu Asp Val Trp Asp Ser Trp Pro Val Gln Asp Val Arg Thr 
260 265 270 

Gly Gln Val Ala Asn Trp Asn Gly Tyr Gln Leu Val Ile Ala Met Met 
275 280 285 

Gly Ile Pro Asn Gln Asn Asp Asn His Ile Tyr Leu Leu Tyr Asn Lys 
290 295 300 

Tyr Gly Asp Asn Glu Leu Ser His Trp Lys Asn Val Gly Pro Ile Phe 
305 310 315 320 

Gly Tyr Asn Ser Thr Ala Val Ser Gln Glu Trp Ser Gly Ser Ala Val 
325 330 335 

Leu Asn Ser Asp Asn Ser Ile Gln Leu Phe Tyr Thr Arg Val Asp Thr 
340 345 350 

Ser Asp Asn Asn Thr Asn His Gln Lys Ile Ala Ser Ala Thr Leu Tyr 
355 360 365 

Leu Thr Asp Asn Asn Gly Asn Val Ser Leu Ala Gln Val Arg Asn Asp 
370 375 380 

Tyr Ile Val Phe Glu Gly Asp Gly Tyr Tyr Tyr Gln Thr Tyr Asp Gln 
385 390 395 400 

Trp Lys Ala Thr Asn Lys Gly Ala Asp Asn Ile Ala Met Arg Asp Ala 
405 410 415 

His Val Ile Glu Asp Gly Asn Gly Asp Arg Tyr Leu Val Phe Glu Ala 
420 425 430 

Ser Thr Gly Leu Glu Asn Tyr Gln Gly Glu Asp Gln Ile Tyr Asn Trp 
435 440 445 

Leu Asn Tyr Gly Gly Asp Asp Ala Phe Asn Ile Lys Ser Leu Phe Arg 
450 455 460 

Ile Leu Ser Asn Asp Asp Ile Lys Ser Arg Ala Thr Trp Ala Asn Ala 
465 470 475 480 

Ala Ile Gly Ile Leu Lys Leu Asn Lys Asp Glu Lys Asn Pro Lys Val 
485 490 495 

Ala Glu Leu Tyr Ser Pro Leu Ile Ser Ala Pro Met Val Ser Asp Glu 
500 505 510 

Ile Glu Arg Pro Asn Val Val Lys Leu Gly Asn Lys Tyr Tyr Leu Phe 
515 520 525 

Ala Ala Thr Arg Leu Asn Arg Gly Ser Asn Asp Asp Ala Trp Met Asn 
530 535 540 

Ala Asn Tyr Ala Val Gly Asp Asn Val Ala Met Val Gly Tyr Val Ala 
545 550 555 560 

Asp Ser Leu Thr Gly Ser Tyr Lys Pro Leu Asn Asp Ser Gly Val Val 
565 570 575 

Leu Thr Ala Ser Val Pro Ala Asn Trp Arg Thr Ala Thr Tyr Ser Tyr 
580 585 590 

Tyr Ala Val Pro Val Ala Gly Lys Asp Asp Gln Val Leu Val Thr Ser 
595 600 605 

Tyr Met Thr Asn Arg Asn Gly Val Ala Gly Lys Gly Met Asp Ser Thr 
610 615 620 

Trp Ala Pro Ser Phe Leu Leu Gln Ile Asn Pro Asp Asn Thr Thr Thr 
625 630 635 640 

Val Leu Ala Lys Met Thr Asn Gln Gly Asp Trp Ile Trp Asp Asp Ser 
645 650 655 

Ser Glu Asn Leu Asp Met Ile Gly Asp Leu Asp Ser Ala Ala Leu Pro 
660 665 670 

Gly Glu Arg Asp Lys Pro Val Asp Trp Asp Leu Ile Gly Tyr Gly Leu 
675 680 685 

Lys Pro His Asp Pro Ala Thr Pro Asn Asp Pro Glu Thr Pro Thr Thr 
690 695 700 

Pro Glu Thr Pro Glu Thr Pro Asn Thr Pro Lys Thr Pro Lys Thr Pro 
705 710 715 720 

Glu Asn Pro Gly Thr Pro Gln Thr Pro Asn Thr Pro Asn Thr Pro Glu 
725 730 735 

Ile Pro Leu Thr Pro Glu Thr Pro Lys Gln Pro Glu Thr Gln Thr Asn 
740 745 750 

Asn Arg Leu Pro Gln Thr Gly Asn Asn Ala Asn Lys Ala Met Ile Gly 
755 760 765 

Leu Gly Met Gly Thr Leu Leu Ser Met Phe Gly Leu Ala Glu Ile Asn 
770 775 780 

Lys Arg Arg Phe Asn 
785 



<210> SEQ ID NO 2 
<211> LENGTH: 2367 
<212> TYPE: DNA 
<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 2 

atgtataaaa gcggtaaaaa ttgggcagtc gttacactct cgactgctgc gctggtattt 60 

ggtgcaacaa ctgtaaatgc atccgcggac acaaatattg aaaacaatga ttcttctact 120 

gtacaagtta caacaggtga taatgatatt gctgttaaaa gtgtgacact tggtagtggt 180 

caagttagtg cagctagtga tacgactatt agaacttctg ctaatgcaaa tagtgcttct 240 

tctgccgcta atacacaaaa ttctaacagt caagtagcaa gttctgctgc aataacatca 300 

tctacaagtt ccgcagcttc attaaataac acagatagta aagcggctca agaaaatact 360 

aatacagcca aaaatgatga cacgcaaaaa gctgcaccag ctaacgaatc ttctgaagct 420 

aaaaatgaac cagctgtaaa cgttaatgat tcttcagctg caaaaaatga tgatcaacaa 480 

tccagtaaaa agaatactac cgctaagtta aacaaggatg ctgaaaacgt tgtaaaaaag 540 

gcgggaattg atcctaacag tttaactgat gaccagatta aagcattaaa taagatgaac 600 

ttctcgaaag ctgcaaagtc tggtacacaa atgacttata atgatttcca aaagattgct 660 

gatacgttaa tcaaacaaga tggtcggtac acagttccat tctttaaagc aagtgaaatc 720 

aaaaatatgc ctgccgctac aactaaagat gcacaaacta atactattga acctttagat 780 

gtatgggatt catggccagt tcaagatgtt cggacaggac aagttgctaa ttggaatggc 840 

tatcaacttg tcatcgcaat gatgggaatt ccaaaccaaa atgataatca tatctatctc 900 

ttatataata agtatggtga taatgaatta agtcattgga agaatgtagg tccaattttt 960 

ggctataatt ctaccgcggt ttcacaagaa tggtcaggat cagctgtttt gaacagtgat 1020 

aactctatcc aattatttta tacaagggta gacacgtctg ataacaatac caatcatcaa 1080 

aaaattgcta gcgctactct ttatttaact gataataatg gaaatgtatc actcgctcag 1140 

gtacgaaatg actatattgt atttgaaggt gatggctatt actaccaaac ttatgatcaa 1200 

tggaaagcta ctaacaaagg tgccgataat attgcaatgc gtgatgctca tgtaattgaa 1260 

gatggtaatg gtgatcggta ccttgttttt gaagcaagta ctggtttgga aaattatcaa 1320 

ggcgaggacc aaatttataa ctggttaaat tatggcggag atgacgcatt taatatcaag 1380 

agcttattta gaattctttc caatgatgat attaagagtc gggcaacttg ggctaatgca 1440 

gctatcggta tcctcaaact aaataaggac gaaaagaatc ctaaggtggc agagttatac 1500 

tcaccattaa tttctgcacc aatggtaagc gatgaaattg agcgaccaaa tgtagttaaa 1560 

ttaggtaata aatattactt atttgccgct acccgtttaa atcgaggaag taatgatgat 1620 

gcttggatga atgctaatta tgccgttggt gataatgttg caatggtcgg atatgttgct 1680 

gatagtctaa ctggatctta taagccatta aatgattctg gagtagtctt gactgcttct 1740 

gttcctgcaa actggcggac agcaacttat tcatattatg ctgtccccgt tgccggaaaa 1800 

gatgaccaag tattagttac ttcatatatg actaatagaa atggagtagc gggtaaagga 1860 

atggattcaa cttgggcacc gagtttctta ctacaaatta acccggataa cacaactact 1920 

gttttagcta aaatgactaa tcaaggggat tggatttggg atgattcaag cgaaaatctt 1980 

gatatgattg gtgatttaga ctccgctgct ttacctggcg aacgtgataa acctgttgat 2040 

tgggacttaa ttggttatgg attaaaaccg catgatcctg ctacaccaaa tgatcctgaa 2100 

acgccaacta caccagaaac ccctgagaca cctaatactc ccaaaacacc aaagactcct 2160 

gaaaatcctg ggacacctca aactcctaat acacctaata ctccggaaat tcctttaact 2220 

ccagaaacgc ctaagcaacc tgaaacccaa actaataatc gtttgccaca aactggaaat 2280 

aatgccaata aagccatgat tggcctaggt atgggaacat tgcttagtat gtttggtctt 2340 

gcagaaatta acaaacgtcg atttaac 2367 



<210> SEQ ID NO 3 
<211> LENGTH: 2394 
<212> TYPE: DNA 
<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 3 

atgctagaac gcaaggaaca taaaaaaatg tataaaagcg gtaaaaattg ggcagtcgtt 60 
acactctcga ctgctgcgct ggtatttggt gcaacaactg taaatgcatc cgcggacaca 120 
aatattgaaa acaatgattc ttctactgta caagttacaa caggtgataa tgatattgct 180 
gttaaaagtg tgacacttgg tagtggtcaa gttagtgcag ctagtgatac gactattaga 240 
acttctgcta atgcaaatag tgcttcttct gccgctaata cacaaaattc taacagtcaa 300 
gtagcaagtt ctgctgcaat aacatcatct acaagttccg cagcttcatt aaataacaca 360 
gatagtaaag cggctcaaga aaatactaat acagccaaaa atgatgacac gcaaaaagct 420 
gcaccagcta acgaatcttc tgaagctaaa aatgaaccag ctgtaaacgt taatgattct 480 
tcagctgcaa aaaatgatga tcaacaatcc agtaaaaaga atactaccgc taagttaaac 540 
aaggatgctg aaaacgttgt aaaaaaggcg ggaattgatc ctaacagttt aactgatgac 600 
cagattaaag cattaaataa gatgaacttc tcgaaagctg caaagtctgg tacacaaatg 660 
acttataatg atttccaaaa gattgctgat acgttaatca aacaagatgg tcggtacaca 720 
gttccattct ttaaagcaag tgaaatcaaa aatatgcctg ccgctacaac taaagatgca 780 
caaactaata ctattgaacc tttagatgta tgggattcat ggccagttca agatgttcgg 840 
acaggacaag ttgctaattg gaatggctat caacttgtca tcgcaatgat gggaattcca 900 
aaccaaaatg ataatcatat ctatctctta tataataagt atggtgataa tgaattaagt 960 
cattggaaga atgtaggtcc aatttttggc tataattcta ccgcggtttc acaagaatgg 1020 
tcaggatcag ctgttttgaa cagtgataac tctatccaat tattttatac aagggtagac 1080 
acgtctgata acaataccaa tcatcaaaaa attgctagcg ctactcttta tttaactgat 1140 
aataatggaa atgtatcact cgctcaggta cgaaatgact atattgtatt tgaaggtgat 1200 
ggctattact accaaactta tgatcaatgg aaagctacta acaaaggtgc cgataatatt 1260 
gcaatgcgtg atgctcatgt aattgaagat ggtaatggtg atcggtacct tgtttttgaa 1320 
gcaagtactg gtttggaaaa ttatcaaggc gaggaccaaa tttataactg gttaaattat 1380 
ggcggagatg acgcatttaa tatcaagagc ttatttagaa ttctttccaa tgatgatatt 1440 
aagagtcggg caacttgggc taatgcagct atcggtatcc tcaaactaaa taaggacgaa 1500 
aagaatccta aggtggcaga gttatactca ccattaattt ctgcaccaat ggtaagcgat 1560 
gaaattgagc gaccaaatgt agttaaatta ggtaataaat attacttatt tgccgctacc 1620 
cgtttaaatc gaggaagtaa tgatgatgct tggatgaatg ctaattatgc cgttggtgat 1680 
aatgttgcaa tggtcggata tgttgctgat agtctaactg gatcttataa gccattaaat 1740 
gattctggag tagtcttgac tgcttctgtt cctgcaaact ggcggacagc aacttattca 1800 
tattatgctg tccccgttgc cggaaaagat gaccaagtat tagttacttc atatatgact 1860 
aatagaaatg gagtagcggg taaaggaatg gattcaactt gggcaccgag tttcttacta 1920 
caaattaacc cggataacac aactactgtt ttagctaaaa tgactaatca aggggattgg 1980 
atttgggatg attcaagcga aaatcttgat atgattggtg atttagactc cgctgcttta 2040 
cctggcgaac gtgataaacc tgttgattgg gacttaattg gttatggatt aaaaccgcat 2100 
gatcctgcta caccaaatga tcctgaaacg ccaactacac cagaaacccc tgagacacct 2160 
aatactccca aaacaccaaa gactcctgaa aatcctggga cacctcaaac tcctaataca 2220 
cctaatactc cggaaattcc tttaactcca gaaacgccta agcaacctga aacccaaact 2280 
aataatcgtt tgccacaaac tggaaataat gccaataaag ccatgattgg cctaggtatg 2340 
ggaacattgc ttagtatgtt tggtcttgca gaaattaaca aacgtcgatt taac 2394 



<210> SEQ ID NO 4 
<211> LENGTH: 2592 
<212> TYPE: DNA 
<213> ORGANISM: Lactobacillus reuteri 
<220> FEATURE: 
<221> NAME/KEY: CDS 
<222> LOCATION: (1)..(51) 
<220> FEATURE: 
<221> NAME/KEY: CDS 
<222> LOCATION: (68)..(2434) 
<400> SEQUENCE: 4 

tac aat ggg gtg gcg gag gtg aag aaa cgg ggt tac ttc tat gct aga 48 
Tyr Asn Gly Val Ala Glu Val Lys Lys Arg Gly Tyr Phe Tyr Ala Arg 
1 5 10 15 

acg caaggaacat aaaaaa atg tat aaa agc ggt aaa aat tgg gca gtc gtt 100 
Thr Met Tyr Lys Ser Gly Lys Asn Trp Ala Val Val 
20 25 

aca ctc tcg act gct gcg ctg gta ttt ggt gca aca act gta aat gca 148 
Thr Leu Ser Thr Ala Ala Leu Val Phe Gly Ala Thr Thr Val Asn Ala 
30 35 40 

tcc gcg gac aca aat att gaa aac aat gat tct tct act gta caa gtt 196 
Ser Ala Asp Thr Asn Ile Glu Asn Asn Asp Ser Ser Thr Val Gln Val 
45 50 55 60 

aca aca ggt gat aat gat att gct gtt aaa agt gtg aca ctt ggt agt 244 
Thr Thr Gly Asp Asn Asp Ile Ala Val Lys Ser Val Thr Leu Gly Ser 
65 70 75 

ggt caa gtt agt gca gct agt gat acg act att aga act tct gct aat 292 
Gly Gln Val Ser Ala Ala Ser Asp Thr Thr Ile Arg Thr Ser Ala Asn 
80 85 90 

gca aat agt gct tct tct gcc gct aat aca caa aat tct aac agt caa 340 
Ala Asn Ser Ala Ser Ser Ala Ala Asn Thr Gln Asn Ser Asn Ser Gln 
95 100 105 

gta gca agt tct gct gca ata aca tca tct aca agt tcc gca gct tca 388 
Val Ala Ser Ser Ala Ala Ile Thr Ser Ser Thr Ser Ser Ala Ala Ser 
110 115 120 

tta aat aac aca gat agt aaa gcg gct caa gaa aat act aat aca gcc 436 
Leu Asn Asn Thr Asp Ser Lys Ala Ala Gln Glu Asn Thr Asn Thr Ala 
125 130 135 140 

aaa aat gat gac acg caa aaa gct gca cca gct aac gaa tct tct gaa 484 
Lys Asn Asp Asp Thr Gln Lys Ala Ala Pro Ala Asn Glu Ser Ser Glu 
145 150 155 

gct aaa aat gaa cca gct gta aac gtt aat gat tct tca gct gca aaa 532 
Ala Lys Asn Glu Pro Ala Val Asn Val Asn Asp Ser Ser Ala Ala Lys 
160 165 170 

aat gat gat caa caa tcc agt aaa aag aat act acc gct aag tta aac 580 
Asn Asp Asp Gln Gln Ser Ser Lys Lys Asn Thr Thr Ala Lys Leu Asn 
175 180 185 

aag gat gct gaa aac gtt gta aaa aag gcg gga att gat cct aac agt 628 
Lys Asp Ala Glu Asn Val Val Lys Lys Ala Gly Ile Asp Pro Asn Ser 
190 195 200 

tta act gat gac cag att aaa gca tta aat aag atg aac ttc tcg aaa 676 
Leu Thr Asp Asp Gln Ile Lys Ala Leu Asn Lys Met Asn Phe Ser Lys 
205 210 215 220 

gct gca aag tct ggt aca caa atg act tat aat gat ttc caa aag att 724 
Ala Ala Lys Ser Gly Thr Gln Met Thr Tyr Asn Asp Phe Gln Lys Ile 
225 230 235 

gct gat acg tta atc aaa caa gat ggt cgg tac aca gtt cca ttc ttt 772 
Ala Asp Thr Leu Ile Lys Gln Asp Gly Arg Tyr Thr Val Pro Phe Phe 
240 245 250 

aaa gca agt gaa atc aaa aat atg cct gcc gct aca act aaa gat gca 820 
Lys Ala Ser Glu Ile Lys Asn Met Pro Ala Ala Thr Thr Lys Asp Ala 
255 260 265 

caa act aat act att gaa cct tta gat gta tgg gat tca tgg cca gtt 868 
Gln Thr Asn Thr Ile Glu Pro Leu Asp Val Trp Asp Ser Trp Pro Val 
270 275 280 

caa gat gtt cgg aca gga caa gtt gct aat tgg aat ggc tat caa ctt 916 
Gln Asp Val Arg Thr Gly Gln Val Ala Asn Trp Asn Gly Tyr Gln Leu 
285 290 295 300 

gtc atc gca atg atg gga att cca aac caa aat gat aat cat atc tat 964 
Val Ile Ala Met Met Gly Ile Pro Asn Gln Asn Asp Asn His Ile Tyr 
305 310 315 

ctc tta tat aat aag tat ggt gat aat gaa tta agt cat tgg aag aat 1012 
Leu Leu Tyr Asn Lys Tyr Gly Asp Asn Glu Leu Ser His Trp Lys Asn 
320 325 330 

gta ggt cca att ttt ggc tat aat tct acc gcg gtt tca caa gaa tgg 1060 
Val Gly Pro Ile Phe Gly Tyr Asn Ser Thr Ala Val Ser Gln Glu Trp 
335 340 345 

tca gga tca gct gtt ttg aac agt gat aac tct atc caa tta ttt tat 1108 
Ser Gly Ser Ala Val Leu Asn Ser Asp Asn Ser Ile Gln Leu Phe Tyr 
350 355 360 

aca agg gta gac acg tct gat aac aat acc aat cat caa aaa att gct 1156 
Thr Arg Val Asp Thr Ser Asp Asn Asn Thr Asn His Gln Lys Ile Ala 
365 370 375 380 

agc gct act ctt tat tta act gat aat aat gga aat gta tca ctc gct 1204 
Ser Ala Thr Leu Tyr Leu Thr Asp Asn Asn Gly Asn Val Ser Leu Ala 
385 390 395 

cag gta cga aat gac tat att gta ttt gaa ggt gat ggc tat tac tac 1252 
Gln Val Arg Asn Asp Tyr Ile Val Phe Glu Gly Asp Gly Tyr Tyr Tyr 
400 405 410 

caa act tat gat caa tgg aaa gct act aac aaa ggt gcc gat aat att 1300 
Gln Thr Tyr Asp Gln Trp Lys Ala Thr Asn Lys Gly Ala Asp Asn Ile 
415 420 425 

gca atg cgt gat gct cat gta att gaa gat ggt aat ggt gat cgg tac 1348 
Ala Met Arg Asp Ala His Val Ile Glu Asp Gly Asn Gly Asp Arg Tyr 
430 435 440 

ctt gtt ttt gaa gca agt act ggt ttg gaa aat tat caa ggc gag gac 1396 
Leu Val Phe Glu Ala Ser Thr Gly Leu Glu Asn Tyr Gln Gly Glu Asp 
445 450 455 460 

caa att tat aac tgg tta aat tat ggc gga gat gac gca ttt aat atc 1444 
Gln Ile Tyr Asn Trp Leu Asn Tyr Gly Gly Asp Asp Ala Phe Asn Ile 
465 470 475 

aag agc tta ttt aga att ctt tcc aat gat gat att aag agt cgg gca 1492 
Lys Ser Leu Phe Arg Ile Leu Ser Asn Asp Asp Ile Lys Ser Arg Ala 
480 485 490 

act tgg gct aat gca gct atc ggt atc ctc aaa cta aat aag gac gaa 1540 
Thr Trp Ala Asn Ala Ala Ile Gly Ile Leu Lys Leu Asn Lys Asp Glu 
495 500 505 

aag aat cct aag gtg gca gag tta tac tca cca tta att tct gca cca 1588 
Lys Asn Pro Lys Val Ala Glu Leu Tyr Ser Pro Leu Ile Ser Ala Pro 
510 515 520 

atg gta agc gat gaa att gag cga cca aat gta gtt aaa tta ggt aat 1636 
Met Val Ser Asp Glu Ile Glu Arg Pro Asn Val Val Lys Leu Gly Asn 
525 530 535 540 

aaa tat tac tta ttt gcc gct acc cgt tta aat cga gga agt aat gat 1684 
Lys Tyr Tyr Leu Phe Ala Ala Thr Arg Leu Asn Arg Gly Ser Asn Asp 
545 550 555 

gat gct tgg atg aat gct aat tat gcc gtt ggt gat aat gtt gca atg 1732 
Asp Ala Trp Met Asn Ala Asn Tyr Ala Val Gly Asp Asn Val Ala Met 
560 565 570 

gtc gga tat gtt gct gat agt cta act gga tct tat aag cca tta aat 1780 
Val Gly Tyr Val Ala Asp Ser Leu Thr Gly Ser Tyr Lys Pro Leu Asn 
575 580 585 

gat tct gga gta gtc ttg act gct tct gtt cct gca aac tgg cgg aca 1828 
Asp Ser Gly Val Val Leu Thr Ala Ser Val Pro Ala Asn Trp Arg Thr 
590 595 600 

gca act tat tca tat tat gct gtc ccc gtt gcc gga aaa gat gac caa 1876 
Ala Thr Tyr Ser Tyr Tyr Ala Val Pro Val Ala Gly Lys Asp Asp Gln 
605 610 615 620 

gta tta gtt act tca tat atg act aat aga aat gga gta gcg ggt aaa 1924 
Val Leu Val Thr Ser Tyr Met Thr Asn Arg Asn Gly Val Ala Gly Lys 
625 630 635 

gga atg gat tca act tgg gca ccg agt ttc tta cta caa att aac ccg 1972 
Gly Met Asp Ser Thr Trp Ala Pro Ser Phe Leu Leu Gln Ile Asn Pro 
640 645 650 

gat aac aca act act gtt tta gct aaa atg act aat caa ggg gat tgg 2020 
Asp Asn Thr Thr Thr Val Leu Ala Lys Met Thr Asn Gln Gly Asp Trp 
655 660 665 

att tgg gat gat tca agc gaa aat ctt gat atg att ggt gat tta gac 2068 
Ile Trp Asp Asp Ser Ser Glu Asn Leu Asp Met Ile Gly Asp Leu Asp 
670 675 680 

tcc gct gct tta cct ggc gaa cgt gat aaa cct gtt gat tgg gac tta 2116 
Ser Ala Ala Leu Pro Gly Glu Arg Asp Lys Pro Val Asp Trp Asp Leu 
685 690 695 700 

att ggt tat gga tta aaa ccg cat gat cct gct aca cca aat gat cct 2164 
Ile Gly Tyr Gly Leu Lys Pro His Asp Pro Ala Thr Pro Asn Asp Pro 
705 710 715 

gaa acg cca act aca cca gaa acc cct gag aca cct aat act ccc aaa 2212 
Glu Thr Pro Thr Thr Pro Glu Thr Pro Glu Thr Pro Asn Thr Pro Lys 
720 725 730 

aca cca aag act cct gaa aat cct ggg aca cct caa act cct aat aca 2260 
Thr Pro Lys Thr Pro Glu Asn Pro Gly Thr Pro Gln Thr Pro Asn Thr 
735 740 745 

cct aat act ccg gaa att cct tta act cca gaa acg cct aag caa cct 2308 
Pro Asn Thr Pro Glu Ile Pro Leu Thr Pro Glu Thr Pro Lys Gln Pro 
750 755 760 

gaa acc caa act aat aat cgt ttg cca caa act gga aat aat gcc aat 2356 
Glu Thr Gln Thr Asn Asn Arg Leu Pro Gln Thr Gly Asn Asn Ala Asn 
765 770 775 780 

aaa gcc atg att ggc cta ggt atg gga aca ttg ctt agt atg ttt ggt 2404 
Lys Ala Met Ile Gly Leu Gly Met Gly Thr Leu Leu Ser Met Phe Gly 
785 790 795 

ctt gca gaa att aac aaa cgt cga ttt aac taaatacttt aaaataaaac 2454 
Leu Ala Glu Ile Asn Lys Arg Arg Phe Asn 
800 805 

cgctaagcct taaattcagc ttaacggttt tttattttaa aagtttttat tgtaaaaaag 2514 
cgaattatca ttaatactaa tgcaattgtt gtaagacctt acgacagtag taacaatgaa 2574 
tttgcccatc tttgtcgg 2592 



<210> SEQ ID NO 5 
<211> LENGTH: 5 
<212> TYPE: PRT 
<213> ORGANISM: Lactobacillus reuteri 
<220> FEATURE: 
<221> NAME/KEY: MOD_RES 
<222> LOCATION: (3) 

<223> OTHER INFORMATION: Any amino acid 
<400> SEQUENCE: 5 

Leu Pro Xaa Thr Gly 

1 5 



<210> SEQ ID NO 6 
<211> LENGTH: 27 
<212> TYPE: PRT 

<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 6 

Ala Gln Val Glu Ser Asn Asn Tyr Asn Gly Val Ala Glu Val Asn Thr 
1 5 10 15 

Glu Arg Gln Ala Asn Gly Gln Ile Gly Val Asp 
20 25 



<210> SEQ ID NO 7 
<211> LENGTH: 16 
<212> TYPE: PRT 
<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 7 

Met Ala His Leu Asp Val Trp Asp Ser Trp Pro Val Gln Asp Pro Val 
1 5 10 15 



<210> SEQ ID NO 8 

<211> LENGTH: 9 
<212> TYPE: PRT 

<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 8 

Asn Ala Gly Ser Ile Phe Gly Thr Lys 
1 5 



<210> SEQ ID NO 9 
<211> LENGTH: 19 
<212> TYPE: PRT 

<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 9 

Val Glu Glu Val Tyr Ser Pro Lys Val Ser Thr Leu Met Ala Ser Asp 
1 5 10 15 

Glu Val Glu 



<210> SEQ ID NO 10 
<211> LENGTH: 4634 
<212> TYPE: DNA 
<213> ORGANISM: Lactobacillus reuteri 
<220> FEATURE: 
<221> NAME/KEY: CDS 
<222> LOCATION: (1220)..(3598) 
<220> FEATURE: 
<221> NAME/KEY: RBS 
<222> LOCATION: (1205)..(1210) 
<220> FEATURE: 
<221> NAME/KEY: modified_base 
<222> LOCATION: (2702)..(2707) 
<223> OTHER INFORMATION: a, c, t, g, Other or unknown 
<220> FEATURE: 
<221> NAME/KEY: modified_base 
<222> LOCATION: (3686)..(3698) 
<223> OTHER INFORMATION: a, c, t, g, Other or unknown 
<400> SEQUENCE: 10 

gttaacaaag acaaaatttt atataattct tcaaattaaa tttcccactg taagaacata 60 
aatgggtacc tgtttgatgg gaataatata tttgtaacta accggccggc acctctttct 120 
aatgtgccta ggatgcataa tggatgtaaa ttactagatg gcggttttta tacattaacc 180 
tcgcaggaga gaaaagaagc aattagtaag gatccatatg cagataaatt tattaggcct 240 
tatttaggtg ctaaaaattt cattcatgga actgctaggt actgtatttg gttaaaggac 300 
gcaaacccga aagatatcca tcaatcgcca tttatactgg atagaatcaa taaagtagcg 360 
gaattcagat cgcagcaaaa aagtaaagat acacaaaaat atgcaaaacg gcccatgcta 420 
acaacacgac ttgcctatta tagccacgat gtacatacgg atatgctgat agtacctgca 480 
acatcatcgc aacgtagaga atatcttcca attggatatg tttcagaaaa gaatattgtg 540 



tcttattcac taatgctaat ccccaatgct agtaatttta atttcggtat tctagaatct 600 
aaagttcact atatttggtt aaaaaacttt tgcggtcggt tgaagtccga ttatcgttat 660 
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tcaaacacta 


ttatttataa 


taatttccct 


tggccgactg 


ttggtgacaa 


gccaggamca 


720 


acaccatctc 


tgacactcgc 


tcaaggtata 


ttaaatactc 


gcaagctcta 


tccagacagc 


780 


tcactggctg 


atctttatga 


tccactaaca 


atgccragt-b 


gaactcgtaa 


agctcatgaa 


840 


gccaatgata 


aagctgttct 


taaagcatat 


gga^tgagcc 


ctaaagctac 


tgagcaagaa 


900 


atcgtagaac 


atctatttaa 


gatgtatgaa 


aaactgacta 


aaggtgaaag 


ataactttgt 


960 


aaaaccaata 


ttttataaag 


acagtaaatg 


ttaatttgat 


aaaaacatat 


atttaataaa 


1020 


caaaagtga-b 


a^aatcaagt 


agttctttgt 


at'tacaaaa-b 


acatttaata 


tctctcagca 


1080 


ttttgcatac 


tgggagattt 


tttattgaca 


aattgtttga 


aagtgcttat 


gatgaaaccg 


1140 


tgtagaaact 


aattcaatt-t 


gataaacgtt 


agacatttct 


gaggaggaag 


tcattttgga 


1200 



gtacaaagaa cataagaaa atg tat aaa gtc ggc aag aat tgg gcc gtt get 1252 
Met Tyr Lys Val Gly Lys Asn Trp Ala Val Ala 

15 10 

aca ttg gta tea get tea att tta atg gga ggg gtt gta aec get cat 1300 
Thr Leu Val Ser Ala Ser lie Leu Met Gly Gly Val Val Thr Ala His 
15 20 25 

get gat caa gta gaa agt aac aat tac aac ggt gtt get gaa gtt aat 1348 
Ala Asp Gin Val Glu Ser Asn Asn Tyr Asn Gly Val Ala Glu Val Asn 
30 35 40 

aet gaa cgt caa get aat ggt caa att ggc gta gat gga aaa att att 1396 
Thr Glu Arg Gin Ala Asn Gly Gin lie Gly Val Asp Gly Lys He He 

45 50 55 

agt get aae agt aat aea ace agt ggc teg aca aat caa gaa tea tct 144 4 

Ser Ala Asn Ser Asn Thr Thr Ser Gly Ser Thr Asn Gin Glu Ser Ser 
60 65 70 75 

get act aae aat act gaa aat get gtt gtt aat gaa age aaa aat aet 1492 
Ala Thr Asn Asn Thr Glu Asn Ala Val Val Asn Glu Ser Lys Asn Thr 
80 85 90 

aac aat act gaa aat get gtt gtt aat gaa aac aaa aat act aac aat 1540 
Asn Asn Thr Glu Asn Ala Val Val Asn Glu Asn Lys Asn Thr Asn Asn 

95 100 105 

aet gaa aat get gtt gtt aat gaa aae aaa aat act aae aae aca gaa 1588 
Thr Glu Asn Ala Val Val Asn Glu Asn Lys Asn Thr Asn Asn Thr Glu 
110 115 120 

aac gat aat agt caa tta aag tta act aat aat gaa caa cca tea gcc 1636 
Asn Asp Asn Ser Gin Leu Lys Leu Thr Asn Asn Glu Gin Pro Ser Ala 
125 130 135 

get act caa gca aac ttg aag aag eta aat cct caa get get aag get 1684 
Ala Thr Gin Ala Asn Leu Lys Lys Leu Asn Pro Gin Ala Ala Lys Ala 
140 145 150 155 

gtt caa aat gcc aag att gat gcc ggt agt tta aca gat gat caa att 1732 
Val Gin Asn Ala Lys He Asp Ala Gly Ser Leu Thr Asp Asp Gin He 
160 165 170 

aat gaa tta aat aag att aac ttc tct aag tct get gaa aag ggt gca 1780 
Asn Glu Leu Asn Lys He Asn Phe Ser Lys Ser Ala Glu Lys Gly Ala 
175 180 185 

aaa ttg acc ttt aag gac tta gag ggg att ggt aat get att gtt aag 1828 
Lys Leu Thr Phe Lys Asp Leu Glu Gly He Gly Asn Ala He Val Lys 

190 195 200 

caa gat cca caa tat get att cct tat tct aat get aag gaa ate aag 1876 
Gin Asp Pro Gin Tyr Ala He Pro Tyr Ser Asn Ala Lys Glu He Lys 
205 210 215 

aat atg cct gca aca tac act gta gat gee caa aca ggt aag atg get 1924 
Asn Met Pro Ala Thr Tyr Thr Val Asp Ala Gin Thr Gly Lys Met Ala 
220 225 230 235 
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cat ctt gat gtc tgg gac tct tgg cca gta caa gat cct gtc aca ggt 1972 
His Leu Asp Val Trp Asp Ser Trp Pro Val Gin Asp Pro Val Thr Gly 
240 245 250 

tat gta tct aat tac atg ggt tat caa eta gtt att get atg atg ggt 2020 
Tyr Val Ser Asn Tyr Met Gly Tyr Gin Leu Val lie Ala Met Met Gly 
255 260 " 265 

att cca aat teg cca act gga gat aat cat ate tat ctt ctt tac aac 2068 
lie Pro Asn Ser Fro Thr Gly Asp Asn His He Tyr Leu Leu Tyr Asn 
270 275 280 

aag tat ggt gat aat gac ttt tct cat tgg cgc aat gca ggt tea ate 2116 
Lys Tyr Gly Asp Asn Asp Phe Ser His Trp Arg Asn Ala Gly Ser He 
285 290 295 

ttt gga act aaa gaa aca aat gtg ttc caa gaa tgg tea ggt tea get 2164 
Phe Gly Thr Lys Glu Thr Asn Val Phe Gin Glu Trp Ser Gly Ser Ala 
300 305 310 315 

att gta aat gat gat ggt aca att caa eta ttt tte ace tea aat gat 2212 
He Val Asn Asp Asp Gly Thr He Gin Leu Phe Phe Thr Ser Asn Asp 
320 325 330 

acg tct gat tac aag ttg aat gat caa cgc ctt get acc gca aca tta 2260 
Thr Ser Asp Tyr Lys Leu Asn Asp Gin Arg Leu Ala Thr Ala Thr Leu 

335 340 345 

aac ctt aat gtt gat gat aac ggt gtt tea ate aag agt gtt gat aat 2308 
Asn Leu Asn Val Asp Asp Asn Gly Val Ser He Lys Ser Val Asp Asn 
350 355 360 

tat caa gtt ttg ttt gaa ggt gat gga ttt cac tac caa act tat gaa 2356 
Tyr Gin Val Leu Phe Glu Gly Asp Gly Phe His Tyr Gin Thr Tyr Glu 
365 370 375 

caa ttc gca aac ggc aaa gat egt gaa aat gat gat tac tge tta cgt 2404 
Gin Phe Ala Asn Gly Lys Asp Arg Glu Asn Asp Asp Tyr Cys Leu Arg 

380 385 390 395 

gac cca cac gtt gtt caa tta gaa aat ggt gat cgt tat ctt gta ttc 2452 
Asp Pro His Val Val Gin Leu Glu Asn Gly Asp Arg Tyr Leu Val Phe 
400 405 410 

gaa get aat act ggg aca gaa gat tac caa agt gac gac caa att tat 2500 
Glu Ala Asn Thr Gly Thr Glu Asp Tyr Gin Ser Asp Asp Gin He Tyr 
415 420 425 

aat tgg get aac tat ggt ggc gat gat gee ttc aat att aag agt tec 2548 
Asn Trp Ala Asn Tyr Gly Gly Asp Asp Ala Phe Asn He Lys Ser Ser 
430 435 440 

ttc aag ctt ttg aat aat aag aag gat cgt gaa ttg get ggt tta get 2596 
Phe Lys Leu Leu Asn Asn Lys Lys Asp Arg Glu Leu Ala Gly Leu Ala 
445 450 455 

aat ggt gca ctt ggt ate tta aag etc act aac aat caa agt aag cca 2644 
Asn Gly Ala Leu Gly He Leu Lys Leu Thr Asn Asn Gin Ser Lys Pro 
460 465 470 475 

aag gtt gaa gaa gta tac tea cca ttg gta tct act ttg atg get tge 2692 
Lys Val Glu Glu Val Tyr Ser Pro Leu Val Ser Thr Leu Met Ala Cys 
480 485 490 

gat gag gta nnn nnn aag ctt ggt gat aag tat tat etc ttc tec gta 27 40 
Asp Glu Val Xaa Xaa Lys Leu Gly Asp Lys Tyr Tyr Leu Phe Ser Val 
495 500 505 

act cgt gta agt cgt ggt tec gat cgt gaa tta acc get aag gat aac 2788 
Thr Arg Val Ser Arg Gly Ser Asp Arg Glu Leu Thr Ala Lys Asp Asn 
510 515 520 

aca ate gtt ggt gat aac gtt get atg att ggt tac gtt tec gat age 2836 
Thr He Val Gly Asp Asn Val Ala Met He Gly Tyr Val Ser Asp Ser 

525 530 535 

tta atg ggt aag tac aag cca tta aat aac tea ggt gtc gta tta act 2884 
Leu Met Gly Lys Tyr Lys Pro Leu Asn Asn Ser Gly Val Val Leu Thr 
540 545 550 555 
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gca tea gta cct gca aac tgg cgt act get act 
Ala Ser Val Pro Ala Asn Trp Arg Thr Ala Thr 
560 565 



tat tec tac tat gca 
Tyr Ser Tyr Tyr Ala 
570 



2932 



gta cct gta get ggt cat cct gat caa gta tta 
Val Pro Val Ala Gly His Pro Asp Gin Val Leu 
575 580 



att act tct tac atg 
He Thr Ser Tyr Met 
585 



2980 



agt aac aag gac ttt get tea ggt gaa gga aac 
Ser Aen Lys Asp Phe Ala Ser Gly Glu Gly Asn 
590 595 



tat gca act tgg gca 
Tyr Ala Thr Trp Ala 
600 



3028 



oca agt ttc tta gta caa ate aat cea gat gac 
Pro Ser Phe Leu Val Gin He Asn Pro Asp Asp 
605 610 



acg aca act gta tta 
Thr Thr Thr Val Leu 
615 



3076 



gca cgt gca act aac caa ggt gac tgg gtg tgg 
Ala Arg Ala Thr Asn Gin Gly Asp Trp Val Trp 
620 625 630 



gac gac tct agt egg 
Asp Asp Ser Ser Arg 
635 



aac gat aat atg etc ggt gtt ctt aaa gaa ggt 
Asn Asp Asn Met Leu Gly Val Leu Lys Glu Gly 
640 645 



gca get aac agt gee 
Ala Ala Asn Ser Ala 

650 



gcc tta cea ggt gaa tgg ggt aag cca gtt gac 
Ala Leu Pro Gly Glu Trp Gly Lys Pro Val Asp 
655 660 



tgg agt ttg att aac 
Trp Ser Leu He Asn 
665 



aga agt cct ggc tta ggc tta aag cct cat caa 
Arg Ser Pro Gly Leu Gly Leu Lys Pro His Gin 

670 675 



cca gtt caa cca aag 

Pro Val Gin Pro Lys 

680 



att gat caa cct gat caa caa cct tct ggt caa 
He Asp Gin Pro Asp Gin Gin Pro Ser Gly Gin 
685 690 



aac act aag aat gtc 
Asn Thr Lys Asn Val 
695 



aca cca ggt aat ggt gat aag cct get ggt aag 
Thr Pro Gly Asn Gly Asp Lys Pro Ala Gly Lys 
700 705 710 



gca act cct gat aac 
Ala Thr Pro Asp Asn 
715 



act aat att gat cca agt gca caa cct tct ggt 
Thr Asn He Asp Pro Ser Ala Gin Pro Ser Gly 
720 725 



caa aac act aat att 
Gin Asn Thr Asn He 
730 



3412 



gat cca agt gca caa met tct ggt caa aac act 
Asp Pro Ser Ala Gin Xaa Ser Gly Gin Asn Thr 
735 740 



aag aat gtc aca cca 
Lys Asn Val Thr Pro 
745 



ggt aat gag aaa caa ggt aag aat acc gat gca 
Gly Asn Glu Lys Gin Gly Lys Asn Thr Asp Ala 
750 755 



aaa caa tta cca caa 
Lys Gin Leu Pro Gin 
760 



aca ggt aat aag tct ggt tta gca gga ctt tac 
Thr Gly Asn Lys Ser Gly Leu Ala Gly Leu Tyr 
765 770 



get ggt tea tta ctt 
Ala Gly Ser Leu Leu 

775 



gcc ttg ttt gga ttg gca gca att gaa aag cgt 
Ala Leu Phe Gly Leu Ala Ala He Glu Lys Arg 



cac get taa 
His Ala 



780 


785 


790 








tagagtaaaa aaacatcctc cactcaagtt acaagtagga taatatgtat tatttctacg 3658
cytagtcaag aggrattact ggacatannn nnnnnnnnnn tccagttacc aagtggaata 3718
tagtattatt ccacgctagt caggaggatt actgacatta ttggctacat ggccggtagt 3778
cctcttttct tttgtgacga attgtcaaac caagtgcaac ggtttctcaa aaaacacctc 3838
atatggggtt tcataattta acacttttcg aggacggcgg ttcagctgat gttggcagaa 3898
actgacgtcc ttatctgtat aatcatcaat attagccctt ttaggaaagt attccctaat 3958
tagaccattg gtattttcat tgggtcctct ttcctctggt gaatagggat ctggccaata 4018
gatagctact cctaaacgtc ctcgaatatc attcaagcca agaaattcac gcccatgatc 4078
tggagtcaat gaatggacaa attctttagg aatagaccct aagagatcaa ttaagccctg 4138
atattrgaat tcggagaagg ggag^t.g1^cc aacaa'btgcc g'tlia'taa'tac caggg^taa't 4198
acggccctgg gcctctacgg taatattgta tttttggctc agatcagtga tagaaaccca 4258
cagatttagc ttgccggtgg agtgctgctt gaagtcttca attacttcgt taccatgttt 4318
gattgctaat ctgatgtgtc gttgttgtgg tgtagtaggc atcataccac ctcctcataa 4378
aataaggtat aacaggaatt tcttgtacta tatgatcctt ccaatataat aatattaggc 4438
cgataagaaa tgaccagcta ccatttcttg atgcttagtg aatataatcg gatgatacgt 4498
cacccctcaa caatccaatt tcacggaggt gagtaatcat gccgagagct aggaatgatt 4558
ggaggcacga acacggtcca tgcggcagtg gctatttgga ttttagccaa agcagcgtta 4618
ctgcttgcaa aagctt 4634



<210> SEQ ID NO 11 
<211> LENGTH: 792 
<212> TYPE: PRT 

<213> ORGANISM: Lactobacillus reuteri 

<220> FEATURE: 

<221> NAME/KEY: MOD_RES

<222> LOCATION: (495)..(496)

<223> OTHER INFORMATION: Any amino acid 

<220> FEATURE: 

<221> NAME/KEY: MOD_RES 

<222> LOCATION: (737) 

<223> OTHER INFORMATION: Thr or Pro 

<400> SEQUENCE: 11 

Met Tyr Lys Val Gly Lys Asn Trp Ala Val Ala Thr Leu Val Ser Ala
1 5 10 15

Ser Ile Leu Met Gly Gly Val Val Thr Ala His Ala Asp Gln Val Glu
20 25 30

Ser Asn Asn Tyr Asn Gly Val Ala Glu Val Asn Thr Glu Arg Gln Ala
35 40 45

Asn Gly Gln Ile Gly Val Asp Gly Lys Ile Ile Ser Ala Asn Ser Asn
50 55 60

Thr Thr Ser Gly Ser Thr Asn Gln Glu Ser Ser Ala Thr Asn Asn Thr
65 70 75 80

Glu Asn Ala Val Val Asn Glu Ser Lys Asn Thr Asn Asn Thr Glu Asn
85 90 95

Ala Val Val Asn Glu Asn Lys Asn Thr Asn Asn Thr Glu Asn Ala Val
100 105 110

Val Asn Glu Asn Lys Asn Thr Asn Asn Thr Glu Asn Asp Asn Ser Gln
115 120 125

Leu Lys Leu Thr Asn Asn Glu Gln Pro Ser Ala Ala Thr Gln Ala Asn
130 135 140

Leu Lys Lys Leu Asn Pro Gln Ala Ala Lys Ala Val Gln Asn Ala Lys
145 150 155 160

Ile Asp Ala Gly Ser Leu Thr Asp Asp Gln Ile Asn Glu Leu Asn Lys
165 170 175

Ile Asn Phe Ser Lys Ser Ala Glu Lys Gly Ala Lys Leu Thr Phe Lys
180 185 190

Asp Leu Glu Gly Ile Gly Asn Ala Ile Val Lys Gln Asp Pro Gln Tyr
195 200 205

Ala Ile Pro Tyr Ser Asn Ala Lys Glu Ile Lys Asn Met Pro Ala Thr
210 215 220

Tyr Thr Val Asp Ala Gln Thr Gly Lys Met Ala His Leu Asp Val Trp
225 230 235 240

Asp Ser Trp Pro Val Gln Asp Pro Val Thr Gly Tyr Val Ser Asn Tyr
245 250 255

Met Gly Tyr Gln Leu Val Ile Ala Met Met Gly Ile Pro Asn Ser Pro
260 265 270

Thr Gly Asp Asn His Ile Tyr Leu Leu Tyr Asn Lys Tyr Gly Asp Asn
275 280 285

Asp Phe Ser His Trp Arg Asn Ala Gly Ser Ile Phe Gly Thr Lys Glu
290 295 300

Thr Asn Val Phe Gln Glu Trp Ser Gly Ser Ala Ile Val Asn Asp Asp
305 310 315 320

Gly Thr Ile Gln Leu Phe Phe Thr Ser Asn Asp Thr Ser Asp Tyr Lys
325 330 335

Leu Asn Asp Gln Arg Leu Ala Thr Ala Thr Leu Asn Leu Asn Val Asp
340 345 350

Asp Asn Gly Val Ser Ile Lys Ser Val Asp Asn Tyr Gln Val Leu Phe
355 360 365

Glu Gly Asp Gly Phe His Tyr Gln Thr Tyr Glu Gln Phe Ala Asn Gly
370 375 380

Lys Asp Arg Glu Asn Asp Asp Tyr Cys Leu Arg Asp Pro His Val Val
385 390 395 400

Gln Leu Glu Asn Gly Asp Arg Tyr Leu Val Phe Glu Ala Asn Thr Gly
405 410 415

Thr Glu Asp Tyr Gln Ser Asp Asp Gln Ile Tyr Asn Trp Ala Asn Tyr
420 425 430

Gly Gly Asp Asp Ala Phe Asn Ile Lys Ser Ser Phe Lys Leu Leu Asn
435 440 445

Asn Lys Lys Asp Arg Glu Leu Ala Gly Leu Ala Asn Gly Ala Leu Gly
450 455 460

Ile Leu Lys Leu Thr Asn Asn Gln Ser Lys Pro Lys Val Glu Glu Val
465 470 475 480

Tyr Ser Pro Leu Val Ser Thr Leu Met Ala Cys Asp Glu Val Xaa Xaa
485 490 495

Lys Leu Gly Asp Lys Tyr Tyr Leu Phe Ser Val Thr Arg Val Ser Arg
500 505 510

Gly Ser Asp Arg Glu Leu Thr Ala Lys Asp Asn Thr Ile Val Gly Asp
515 520 525

Asn Val Ala Met Ile Gly Tyr Val Ser Asp Ser Leu Met Gly Lys Tyr
530 535 540

Lys Pro Leu Asn Asn Ser Gly Val Val Leu Thr Ala Ser Val Pro Ala
545 550 555 560

Asn Trp Arg Thr Ala Thr Tyr Ser Tyr Tyr Ala Val Pro Val Ala Gly
565 570 575

His Pro Asp Gln Val Leu Ile Thr Ser Tyr Met Ser Asn Lys Asp Phe
580 585 590

Ala Ser Gly Glu Gly Asn Tyr Ala Thr Trp Ala Pro Ser Phe Leu Val
595 600 605

Gln Ile Asn Pro Asp Asp Thr Thr Thr Val Leu Ala Arg Ala Thr Asn
610 615 620

Gln Gly Asp Trp Val Trp Asp Asp Ser Ser Arg Asn Asp Asn Met Leu
625 630 635 640

Gly Val Leu Lys Glu Gly Ala Ala Asn Ser Ala Ala Leu Pro Gly Glu
645 650 655

Trp Gly Lys Pro Val Asp Trp Ser Leu Ile Asn Arg Ser Pro Gly Leu
660 665 670

Gly Leu Lys Pro His Gln Pro Val Gln Pro Lys Ile Asp Gln Pro Asp
675 680 685

Gln Gln Pro Ser Gly Gln Asn Thr Lys Asn Val Thr Pro Gly Asn Gly
690 695 700

Asp Lys Pro Ala Gly Lys Ala Thr Pro Asp Asn Thr Asn Ile Asp Pro
705 710 715 720

Ser Ala Gln Pro Ser Gly Gln Asn Thr Asn Ile Asp Pro Ser Ala Gln
725 730 735

Xaa Ser Gly Gln Asn Thr Lys Asn Val Thr Pro Gly Asn Glu Lys Gln
740 745 750

Gly Lys Asn Thr Asp Ala Lys Gln Leu Pro Gln Thr Gly Asn Lys Ser
755 760 765

Gly Leu Ala Gly Leu Tyr Ala Gly Ser Leu Leu Ala Leu Phe Gly Leu
770 775 780

Ala Ala Ile Glu Lys Arg His Ala
785 790



<210> SEQ ID NO 12 
<211> LENGTH: 24 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 12

ctgataataa tggaaatgta tcac 24



<210> SEQ ID NO 13 
<211> LENGTH: 26 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 13 

catgatcata agtttggtag taatag 26 



<210> SEQ ID NO 14 
<211> LENGTH: 24 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 14 

gtgatacatt tccattatta tcag 24 



<210> SEQ ID NO 15 
<211> LENGTH: 26 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 15 

ctattactac caaacttatg atcatg 26 



<210> SEQ ID NO 16 
<211> LENGTH: 38 
<212> TYPE: DNA

<213> ORGANISM: Artificial Sequence
<220> FEATURE:

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 16 

ccatggccat ggtagaacgc aaggaacata aaaaaatg 38 



<210> SEQ ID NO 17 
<211> LENGTH: 38 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 17 

agatctagat ctgttaaatc gacgtttgtt aatttctg 38 



<210> SEQ ID NO 18 

<211> LENGTH: 21 

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 

<220> FEATURE: 

<221> NAME/KEY: modified_base

<222> LOCATION: (6) 

<223> OTHER INFORMATION: a, c, t, g, other or unknown 

<220> FEATURE: 

<221> NAME/KEY: modified_base

<222> LOCATION: (15) 

<223> OTHER INFORMATION: a, c, t, g, other or unknown 

<400> SEQUENCE: 18 

gaygtntggg aywsntgggc c 21 



<210> SEQ ID NO 19 

<211> LENGTH: 23 

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<221> NAME/KEY: modified_base

<222> LOCATION: (3)

<223> OTHER INFORMATION: a, c, t, g, other or unknown

<220> FEATURE:

<221> NAME/KEY: modified_base

<222> LOCATION: (6)

<223> OTHER INFORMATION: a, c, t, g, other or unknown

<220> FEATURE:

<221> NAME/KEY: modified_base

<222> LOCATION: (9)

<223> OTHER INFORMATION: a, c, t, g, other or unknown

<220> FEATURE:

<221> NAME/KEY: modified_base

<222> LOCATION: (12)

<223> OTHER INFORMATION: a, c, t, g, other or unknown

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 

<400> SEQUENCE: 19 



gtngcnswnc cnswccayta ytg 23



<210> SEQ ID NO 20 
<211> LENGTH: 22 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 20 



gaatgtaggt ccaatttttg gc 22



<210> SEQ ID NO 21

<211> LENGTH: 22

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 

<400> SEQUENCE: 21 

cctgtccgaa catcttgaac tg 22 



<210> SEQ ID NO 22 

<211> LENGTH: 23 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (6)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (9)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (12)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (18)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (21)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<400> SEQUENCE: 22

arraansvmg gngcvmangt nsw 23



<210> SEQ ID NO 23 
<211> LENGTH: 23 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 

<220> FEATURE: 

<221> NAME/KEY: modified_base
<222> LOCATION: (9)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (11)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (15)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<220> FEATURE:

<221> NAME/KEY: modified_base
<222> LOCATION: (21)
<223> OTHER INFORMATION: a, c, t, g, other or unknown
<400> SEQUENCE: 23

tayaayggng tngcngargt naa 23



<210> SEQ ID NO 24 

<211> LENGTH: 22 

<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer

<400> SEQUENCE: 24

ccgaccatct tgtttgatta ac 22 

<210> SEQ ID NO 25 
<211> LENGTH: 24 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<400> SEQUENCE: 25 

aaytataayg gygttgcryg aagt 24 



<210> SEQ ID NO 26 
<211> LENGTH: 21 
<212> TYPE: DNA 

<213> ORGANISM: Artificial Sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: Description of Artificial Sequence: Primer 
<220> FEATURE: 

<221> NAME/KEY: modified_base
<222> LOCATION: (9)

<223> OTHER INFORMATION: a, c, t, g, other or unknown
<400> SEQUENCE: 26 

taccgnwsnc tacttcaact t 21 



<210> SEQ ID NO 27 
<211> LENGTH: 17
<212> TYPE: PRT

<213> ORGANISM: Lactobacillus reuteri
<400> SEQUENCE: 27

Tyr Asn Gly Val Ala Glu Val Lys Lys Arg Gly Tyr Phe Tyr Ala Arg
1 5 10 15

Thr



<210> SEQ ID NO 28 
<211> LENGTH: 17 
<212> TYPE: PRT 

<213> ORGANISM: Lactobacillus reuteri 
<400> SEQUENCE: 28

Tyr Asn Gly Val Ala Glu Val Asn Thr Glu Arg Gln Ala Asn Gly Gly
1 5 10 15

Ile



<210> SEQ ID NO 29 
<211> LENGTH: 14 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus amyloliquefaciens
<400> SEQUENCE: 29

Gly Leu Asp Val Trp Asp Ser Trp Pro Leu Gln Asn Ala Asp
1 5 10



<210> SEQ ID NO 30 
<211> LENGTH: 14 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus subtilis 
<400> SEQUENCE: 30 



Gly Leu Asp Val Trp Asp Ser Trp Pro Leu Gln Asn Ala Asp
1 5 10



<210> SEQ ID NO 31 
<211> LENGTH: 14 
<212> TYPE: PRT 

<213> ORGANISM: Streptococcus mutans
<400> SEQUENCE: 31

Asp Leu Asp Val Trp Asp Ser Trp Pro Val Gln Asp Ala Lys
1 5 10 



<210> SEQ ID NO 32
<211> LENGTH: 14
<212> TYPE: PRT

<213> ORGANISM: Streptococcus salivarius
<400> SEQUENCE: 32

Glu Ile Asp Val Trp Asp Ser Trp Pro Val Gln Asp Ala Lys
1 5 10



<210> SEQ ID NO 33 
<211> LENGTH: 16 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus amyloliquefaciens
<400> SEQUENCE: 33

Gln Thr Gln Glu Trp Ser Gly Ser Ala Thr Phe Thr Ser Asp Gly Lys
1 5 10 15



<210> SEQ ID NO 34 
<211> LENGTH: 16 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus subtilis
<400> SEQUENCE: 34

Gln Thr Gln Glu Trp Ser Gly Ser Ala Thr Phe Thr Ser Asp Gly Lys
1 5 10 15



<210> SEQ ID NO 35 
<211> LENGTH: 16 
<212> TYPE: PRT 

<213> ORGANISM: Streptococcus mutans 
<400> SEQUENCE: 35 

Leu Thr Gln Glu Trp Ser Gly Ser Ala Thr Val Asn Glu Asp Gly Ser
1 5 10 15



<210> SEQ ID NO 36 
<211> LENGTH: 16
<212> TYPE: PRT

<213> ORGANISM: Streptococcus salivarius
<400> SEQUENCE: 36

Asp Asp Gln Gln Trp Ser Gly Ser Ala Thr Val Asn Ser Asp Gly Ser
1 5 10 15



<210> SEQ ID NO 37 
<211> LENGTH: 11 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus amyloliquefaciens

<400> SEQUENCE: 37

Lys Ala Thr Phe Gly Pro Ser Phe Leu Met Asn
1 5 10



<210> SEQ ID NO 38
<211> LENGTH: 11 
<212> TYPE: PRT 

<213> ORGANISM: Bacillus subtilis 
<400> SEQUENCE: 38 

Gln Ser Thr Phe Ala Pro Ser Phe Leu Leu Asn
1 5 10



<210> SEQ ID NO 39
<211> LENGTH: 11 
<212> TYPE: PRT 

<213> ORGANISM: Streptococcus mutans 
<400> SEQUENCE: 39 

Asn Ser Thr Trp Ala Pro Ser Phe Leu Ile Gln
1 5 10



<210> SEQ ID NO 40
<211> LENGTH: 11 
<212> TYPE: PRT 

<213> ORGANISM: Streptococcus salivarius 
<400> SEQUENCE: 40 

Lys Ser Thr Trp Ala Pro Ser Phe Leu Ile Lys
1 5 10 
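
Several of the primers listed above (for example SEQ ID NO 18, 19, 22, 23, 25 and 26) use ambiguity codes such as n, y, r, s and w. The following is a minimal illustrative sketch, not part of the listing itself, of how such a primer can be read: it assumes the standard IUPAC nucleotide ambiguity table, and the helper names `degeneracy` and `reverse_complement` are hypothetical. It counts how many distinct oligonucleotides a degenerate primer stands for and computes a reverse complement.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes and the bases each one allows.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}

# Complement of every code; an ambiguity code maps to the code
# that represents the complementary set of bases.
COMPLEMENT = str.maketrans("acgtrykmswbdhvn", "tgcayrmkswvhdbn")


def clean(seq: str) -> str:
    """Lower-case the sequence and drop the spaces between 10-base blocks."""
    return "".join(seq.lower().split())


def degeneracy(seq: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in clean(seq))


def reverse_complement(seq: str) -> str:
    """Reverse complement, preserving IUPAC ambiguity codes."""
    return clean(seq).translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(degeneracy("gaygtntggg aywsntgggc c"))               # SEQ ID NO 18
    print(reverse_complement("ctgataataa tggaaatgta tcac"))     # SEQ ID NO 12
```

Applied to the sequences as printed, SEQ ID NO 18 expands to 256 distinct oligonucleotides, and the reverse complement of SEQ ID NO 12 is identical to the sequence printed as SEQ ID NO 14 (likewise SEQ ID NO 13 and SEQ ID NO 15).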



What is claimed is: 

1. A process of producing a fructo-oligosaccharide or 
fructo-polysaccharide, having β(2-1) linked D-fructosyl
units comprising forming a mixture by combining sucrose 
with at least one reaction partner selected from the group 
consisting of: 

a) a protein having fructosyltransferase activity, which 
exhibits at least 85% amino acid identity, as determined 
by a BLAST algorithm, with an amino acid sequence of 
SEQ ID No. 1, and 

b) a recombinant host cell containing one or more copies 
of a nucleic acid construct encoding for said protein (a) 
and capable of expressing said protein; 

wherein said reaction partner interacts with sucrose to 
produce a fructo-oligosaccharide or fructo- 
polysaccharide. 

2. The process according to claim 1, wherein said protein 
is a recombinant protein. 

3. A process according to claim 1, further comprising 
chemically modifying said fructo-oligosaccharide or fructo-
polysaccharide by simultaneous 3- and 4-oxidation, 1- or
6-oxidation, phosphorylation, acylation, hydroxyalkylation, 
carboxymethylation or amino-alkylation of one or more 
anhydrofructose units, or by hydrolysis. 

4. The process according to claim 1, further comprising 
adding a food or beverage composition to said mixture to 
obtain a prebiotic composition. 

5. The process according to claim 1, further comprising
adding to said mixture a Lactobacillus strain capable of
producing an oligosaccharide or polysaccharide and optionally
a food or beverage composition, to obtain a synbiotic
composition.



6. A process of producing a fructo-oligosaccharide or
fructo-polysaccharide, having β(2-1) linked D-fructosyl
units comprising combining sucrose and a protein to form a
mixture, said protein having fructosyltransferase activity,
which exhibits at least 85% amino acid identity, as determined
by a BLAST algorithm, with an amino acid sequence
of SEQ ID No. 1, and

interacting said sucrose with said protein to produce said
fructo-oligosaccharide or fructo-polysaccharide.

7. A process for producing a fructo-oligosaccharide or
fructo-polysaccharide, having β(2-6) linked D-fructosyl
units comprising forming a mixture by combining sucrose
with a reaction partner, wherein said reaction partner is a
recombinant host cell containing one or more copies of a
nucleic acid construct encoding for a protein having
fructosyltransferase activity, which exhibits at least 85% amino
acid identity, as determined by a BLAST algorithm, with an
amino acid sequence of SEQ ID No. 11, and wherein said
reaction partner interacts with sucrose to provide a fructo-
oligosaccharide or fructo-polysaccharide.

8. A process according to claim 7, further comprising
chemically modifying said fructo-oligosaccharide or fructo-
polysaccharide by simultaneous 3- and 4-oxidation, 1- or
6-oxidation, phosphorylation, acylation, hydroxyalkylation,
carboxymethylation or amino-alkylation of one or more
anhydrofructose units, or by hydrolysis.
